No. 4 ESS:


Prologue 


By K. E. MARTERSTECK 
(Manuscript received July 30, 1980) 


Since the cutover of the first No. 4 Electronic Switching System
(ESS) office in January 1976, a program of system evolution has been
carried out. As a result, significant cost reductions have been achieved,
major new features have been added to the No. 4 ESS, and the system's
performance has been enhanced. This paper gives an overview of the
continuing development of the No. 4 ESS carried out during the period
1976 to 1980 as a prologue to a series of papers in this volume which
describe some of the specifics of this activity.


I. BACKGROUND


In January 1976, the first No. 4 Electronic Switching System (ESS)
was placed in service in Chicago, Illinois. This culminated the largest
single system development ever undertaken in the Bell System, a
development which produced a high-capacity toll and tandem digital
switching system. The No. 4 ESS, with its powerful central processor
and time-division network, brought to the Bell System telecommunications
network significant improvements in flexibility, reliability, and
economy compared with its electromechanical predecessors.

However, except for the large increase in capacity, the basic
toll-switching features of the early No. 4 ESS machines were essentially a
very modern version of the No. 4A crossbar features. Therefore, the
development of the No. 4 ESS did not end with the cutover of the
Chicago 7 office. Instead, a program of system evolution, planned
to achieve significant cost reduction as well as feature additions and
performance improvements, was vigorously pursued. Stimulated by
dramatic advances in integrated circuit technology and improved
software design and circuit interconnection techniques, the No. 4 ESS
was virtually completely redesigned in the period from 1976 to 1980.
Five major new versions of the software package, called generics, along
with new hardware designs, were introduced at roughly yearly intervals
during this period. These generics added further toll
functions and major new revenue-producing network features; produced
reductions in cost, power, and space; and improved reliability
and maintainability.


II. INITIAL SYSTEM IMPLEMENTATION


The 1976 architecture of No. 4 ESS is shown in simplified form in
Fig. 1. In this original design, analog signals were converted to voiceband
by transmission terminal equipment and then converted to
pulse-code modulated (PCM) signals and multiplexed into DS-120 streams in
the Voiceband Interface Frame (VIF). A Signal Processor Type 1 (SP1)
was connected to the E and M leads from/to the transmission terminal
equipment. The SP1 detected and interpreted state changes on the E
lead and generated appropriate state changes on the M lead. Digital
carrier (T1) signals entered the system at the Digroup Terminal (DT),
where these signals were multiplexed into the DS-120 format. The
Signal Processor Type 2 (SP2) derived supervisory states from the
incoming PCM stream and generated supervisory states in the transmit
direction for inclusion in the outgoing PCM stream.

The No. 4 ESS time-division switching network contains six stages of
time-shared switching: time-space-space-space-space-time. The first
and last pairs of switching stages are implemented in the Time-Slot
Interchange frame (TSI). The TSI also performs a decorrelating function
by which the PCM signals in successive time slots in each incoming
DS-120 digital stream are spread in both time and space as a result of the
switching action. This ensures spreading of traffic over the network
and eliminates the need for load balancing. The middle two stages of
time-shared space switching are provided by the Time-Multiplexed
Switch (TMS). The pattern of network connections in this 1024-by-1024
switch is changed 1.024 x 10^6 times every second. The No. 4 ESS is
internally synchronized by an extremely precise and reliable clocking
system consisting of four crystal-controlled oscillators operating at
16.384 MHz. In normal operation, one oscillator is designated the
master, supplying the timing pulses for the network, while the remaining
three oscillators are in standby, phase-locked to the master.
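
The TMS reconfiguration rate and the clock frequency are related by simple arithmetic. The sketch below assumes the conventional 8-kHz PCM frame rate and 128 time slots per frame; neither figure is stated in this paper, and the function names are illustrative only.

```python
# Assumed values (not stated in the paper): 8-kHz PCM frame rate and
# 128 time slots per frame on the internal digital streams.
FRAME_RATE_HZ = 8_000
SLOTS_PER_FRAME = 128

# Oscillator frequency quoted in the text.
CLOCK_HZ = 16_384_000


def slot_rate_hz(frame_rate_hz: int, slots_per_frame: int) -> int:
    """Time slots per second; the TMS connection pattern changes once per slot."""
    return frame_rate_hz * slots_per_frame


def clock_periods_per_slot(clock_hz: int, slots_per_second: int) -> int:
    """Number of master-clock periods available within one time slot."""
    return clock_hz // slots_per_second


slot_rate = slot_rate_hz(FRAME_RATE_HZ, SLOTS_PER_FRAME)
periods = clock_periods_per_slot(CLOCK_HZ, slot_rate)
print(slot_rate, periods)
```

Under these assumptions the slot rate comes out at 1,024,000 per second, and the 16.384-MHz clock supplies 16 periods per time slot.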

The entire No. 4 ESS is under the control of the powerful 1A
Processor, which has a central control and three memory types:
program store, call store, and file store. The program store contains
the fixed set of software instructions. The call store contains the
time-variant data associated with setting up calls and handling other office
activities. The call store also contains translation data describing the
office configuration and prescribing the office-specific call-handling
parameters. The file store on disk is used primarily as a backup for the
program store and the fixed data in call store. The file store also
contains less frequently used programs, such as diagnostic routines.
The original program and call stores were coincident-current
ferrite-core arrays with fetch cycles of 1400 ns, twice the 700-ns cycle time of
the central control.


III. SYSTEM EVOLUTION


The continuing development of the No. 4 ESS affected both the
software and hardware architecture of the system. When the No. 4 ESS
was first placed into service, its complement of software consisted of
approximately 1.4 x 10^6 stored words in over 800 identifiable functional
units called PIDENTs, written primarily in a language called EPL (ESS
programming language). EPL is an “intermediate level” language, more
powerful than assembly language, but not as powerful as a “high level”
language. EPL, together with a set of macros, offers designers a degree
of mechanization while maintaining tight control over the use of real
time, a matter of constant concern because of the very high switching
capacity requirements. However, to increase the productivity of the
software development staff by taking advantage of more modern
software technology, a high-level language for switching, called EPLX
(ESS programming language extra), was developed for use in the No. 4
ESS. EPLX has modern control constructs and high-level data description
and data structure reference capabilities. Since the EPLX compiler
produces somewhat less efficient code in terms of memory and real-time
consumption, the designer can still mix EPL with EPLX on a
module-by-module basis when these factors are paramount.

An extensive amount of new software has been written for No. 4 ESS
to introduce new call-processing features, expanded administrative
capabilities, and improved maintenance and fault-recovery operations.
As part of the development of the new international switching functions
using CCITT No. 5 and No. 6 signaling, the call-processing software
was restructured to improve its flexibility and maintainability. Over
the last several generics the maintenance and fault-recovery software
was modularized and restructured under a special operating system to
facilitate the inclusion of new peripheral frames into the system. This
new software was written in EPLX. Also, with each new peripheral
frame type introduced into No. 4 ESS, a considerable quantity of new
fault-recovery, maintenance, and diagnostic software was written.
Virtually all areas of administrative function, including recent
change-and-verify, trunk-maintenance, network-management, report-generation,
and operations-support-system interfaces, have been augmented
and expanded. In addition to the international call-switching functions,
major new functions such as Common Channel Interoffice Signaling-Inward
Wide Area Telephone Service, and the Mass Announcement
System have been added. As a result of all this software development
activity, the 1980 No. 4 ESS generic contains in excess of 2.1 x 10^6
words of software program.

Between 1976 and 1980, the evolution of No. 4 ESS hardware has 
been comparably extensive (Fig. 2). In the 1A Processor, the memory 
was upgraded from the original ferrite-core memory arrays to semi- 
conductor memories. The transition was accomplished in two stages. 
In 1977, metal oxide semiconductor (MOS) integrated-circuit memories
in 65,536-word modules were introduced into the system. The fetch- 
cycle time of these memories was 1400 ns, the same as that of the core 
memories. In 1979, newer semiconductor memories with 262,144-word 
modules were added. These memories have a 700-ns fetch-cycle time 
capability. Thus, while the 1A Processor central control was designed 
to function with a mixture of all three memory types, the processors 
equipped entirely with the newer semiconductor stores running in the 
fast mode (700-ns cycle) have 30 percent more processing capacity. 

In addition to the memory upgrade, a new high-speed input/output 
processor was developed for the 1A Processor to handle all the various 
I/O functions for the system, including interfaces with operations 
support systems, such as the Engineering and Administrative Data 
Acquisition System and the Circuit Maintenance System. 

Although the frames in the time-division network perform similar
functions to the original No. 4 ESS network frames, major improvements
have been made in all the frame types. The availability of
medium-scale-integration bipolar memories to replace the original
small-scale-integration insulated gate field-effect transistor (IGFET)
memories in the random access memories of the original frames led to
a redesign of the TSI and TMS frames. Also, bulk dc-to-dc power
converters with 100-ampere capacity were developed. Thus, new TSI
and TMS frames were produced which offered savings in cost, space,
and power consumption.

The network clock was also enhanced to allow automatic synchro- 
nization to a master clock source. Even though the original network 
clock had excellent long-term stability, the individual clocks had to be 
periodically manually adjusted to ensure the synchronization neces- 
sary to prevent data from being lost when transmitted over digital 
facilities between digital offices. The new automatic synchronization 
obviates the need for manual adjustments and ensures more reliable 
operation of the toll network. 

The original architecture of No. 4 ESS utilized echo suppressors for
long transmission circuits. These echo suppressors, which employed
standard analog techniques, were mounted in transmission terminal
equipment, one per voice channel. Subsequently, a digital echo-suppressor
terminal was designed to operate in a time-shared mode on
the DS-120 stream. Functionally the digital echo suppressor is equivalent
to its analog predecessor, but is significantly more economical.
This evolution will continue with the provision of digital echo cancelers,
implemented with very-large-scale integrated circuit chips.

There have been significant modifications to the transmission terminals
in No. 4 ESS, particularly regarding the handling of the analog
carrier interface. Initially, group band carrier was connected to terminal
equipment that converted the signals to baseband. Signaling was
extracted for processing by the SP1 while the baseband information
passed into the VIF, where it was converted to PCM and multiplexed
into DS-120 streams. With the availability of the DT/SP2 complex, it
became economically attractive to use PCM carrier terminals (such as
D4) to provide the analog-to-digital conversion. Further economies
were realized by combining functions. An LT-1 connector frame was
developed to convert 12-channel analog carrier groups to the PCM
DS-1 format. Also, a Digital Interface Frame (DIF) was developed to
handle the DT/SP2 functions. The DIF is a microprocessor-controlled
frame which uses the most current semiconductor device and packaging
technologies.

Finally, powerful new functional capabilities have been provided in
the Bell System network with the addition in No. 4 ESS of the Mass
Announcement System frame and its associated Peripheral Unit Control
(PUC) frame. Like the DIF, these frames are microprocessor controlled
and use the latest device technology.

Table I gives a chronology of the introduction of the major new
No. 4 ESS features described briefly above.


Table I - Major new features

1976  Basic toll features plus Common Channel Interoffice Signaling (CCIS).

1977  Digital echo suppressor, cost-reduced Digroup Terminal (DT), 64K-word
      1400-ns semiconductor store.

1978  International-call switching exchange functions using CCITT No. 5 and 6
      signaling, cost-reduced Time Slot Interchange (TSI), Input/Output
      Processor (IOP), and maintenance and administrative enhancements.

1979  CCIS-Inward Wide Area Telecommunications Service (INWATS), 256K-word
      700-ns semiconductor stores, switching control center interface,
      maintenance and administrative enhancements, and reduced system
      reinitialization time.

1980  Digital Interface Frame (DIF), LT-1 connector, cost-reduced Time
      Multiplexed Switch (TMS), network clock synchronization, mass
      announcements, and maintenance and administrative enhancements.

Fig. 3 - No. 4 ESS evolution space and power drain (40K trunk office).


IV. SUMMARY 


When the No. 4 ESS was initially designed, the best available design
technology was utilized. However, as technology advanced, new opportunities
were presented to simplify the No. 4 ESS architecture as
well as add new features and improve overall system performance.
Space and power requirements (see Fig. 3), as well as cost, were
reduced. System reliability has continuously improved throughout the
period and the system has been kept technologically modern.

This issue of the Bell System Technical Journal contains a series of
papers which detail the No. 4 ESS hardware and software evolution
and discuss some of the significant new capabilities which have been
incorporated into the system. The issue concludes with a presentation
of the performance achieved by the No. 4 ESS in the field.
Mass Announcement Capability


By R. J. FRANK, R. J. KEEVERS, F. B. STREBENDT, 
and J. E. WANINSKI 


(Manuscript received August 26, 1980) 


We describe the mass announcement capability that has been
introduced in No. 4 ESS beginning with the 4E5 generic. This capability
allows various sponsors to provide services in which a large
volume of callers can dial advertised numbers to listen to Public 
Announcement Service announcements, register their opinions via 
telephone calls, or participate in call-ins whereby randomly selected 
callers are connected to a celebrity for a live answer. Although these 
kinds of services are not new, they have usually been offered on a 
limited, individually engineered basis at high administrative cost to 
the telephone network. Mass announcement capability provides a 
general means to accommodate sponsor-provided announcement-re- 
lated services which can be offered on a local, regional, or national 
basis. 


I. INTRODUCTION


Public demand exists for expanded new uses of the telephone to 
provide information and entertainment. Hearing a recorded announce- 
ment over a telephone is increasing in popularity. Radio and television 
stations are encouraging public participation in telethons and call-ins. 
In the past, the scope of these kinds of services has been limited. More 
extensive services, such as a presidential call-in, have required special 
engineering at substantial cost. When the telephone company has not 
been consulted in advance, peaked traffic caused overloads and wide- 
spread congestion in the telephone network. The challenge of the Mass 
Announcement System (MAS) feature is to provide the mechanism so 
that sponsors can offer these kinds of announcement-related capabili- 
ties on a widespread basis to fulfill existing needs and yet to contribute 
significant revenues for the Bell System. High capacity, flexibility, and
network protection are key factors because the served area may be 
large or heavily populated, many different services may be provided 
simultaneously, and calling may be stimulated by media programming. 

The MAS is a major part of the 4E5 generic development on the No.
4 ESS. The No. 4 ESS provides a means of recording announcements
and the capability to connect a large volume of callers to these
announcements. The high terminating capacity of the No. 4 ESS and
its position in the Direct Distance Dialing (DDD) network make it a
viable switching machine for providing MAS services. The technical
architecture inherent in the No. 4 ESS time division/space division
network provides an efficient mechanism for transmitting the announcements
in a digital format.

The Mass Announcement System is an optional No. 4 ESS feature.
No. 4 ESS offices equipped with MAS will be strategically deployed
throughout the country so that sponsors can provide service on a local,
regional, or national basis. Each of these offices will be designated as
an MAS node and its associated calling region will be designated as an
MAS island. National MAS coverage for initial service in 1980 is illustrated
in Fig. 1.


II. MASS ANNOUNCEMENT FEATURE REQUIREMENTS

2.1 Definition of offered features 


The MAS feature consists of a variety of announcement-related
services in which a large volume of callers dial an advertised number
and expect to hear a prerecorded announcement. The main types of
service are as follows:

(i) Public Announcement Service (PAS)
(ii) counting of media stimulated calls
(iii) cut through [typically Media Stimulated Calling (MSC)].


2.1.1 Public announcement service 


The PAS tends to be stable in nature with reasonably predictable
calling patterns. Requests for weather or time are typical applications 
which have been carried on the network for many years. More recently 
new forms of PAS have emerged. Some examples are news, horoscope 
offerings, jokes, sports results, and other entertainment programs. 

A No. 4 ESS with the MAS feature provides capability for
sponsor-offered announcement recordings, quick updating, and the means for
transmitting these announcements, as shown in Fig. 2. An MAS frame
can provide a large number of varying-length synchronous announcements
which start over again every 15 s. A caller to an MAS announcement
is generally supplied with audible ringing until the announcement
starts.

Public Announcement Service announcements may also be provided
by an audio source other than the MAS frame. These are called barge-in
announcements since callers are connected to these announcements
at any point in the announcement cycle after hearing audible ringing.


2.1.2 Counting of media stimulated calls 


Counting of media stimulated calls is a service in which a sponsor 
can stimulate callers through advertising to register their opinion by 
telephone on a topic of general interest. If the question posed on 
television, on radio, or in the print media has a yes or no answer, then 
two telephone numbers would be assigned to have corresponding 
significance. If a dozen candidates were nominated for “most valuable 
player,” then it would be necessary to provide 12 telephone numbers. 
Callers hear an MAS announcement and the No. 4 ESS counts each call
to the specified number. More than one No. 4 ESS office can participate
in the same MSC counting application. The accumulated counts from
all No. 4 ESS offices participating in the same application can be output
to a sponsor's location in near real time, thus allowing these MSC
counting applications to be coordinated with radio or television
programming.


2.1.3 Cut-through service 


Cut-through service is another sponsor-offered MSC service which
permits selective access to a telethon or call-in sponsor while the large
majority of callers are diverted to a customized MAS announcement.
The selectivity relates to one call per unit time which is forwarded to
another DDD directory number to be given personal attention, perhaps
by a politician or celebrity. Cut-through service can be offered in
conjunction with an MSC call counting service as a means of soliciting
additional information from a sample of the callers expressing their
opinion.


2.2 Specification of No. 4 ESS system requirements 

Key attributes of the MAS services are specified in the following
sections in terms of minimum and maximum bounds.


2.2.1 Call terminations


The basic capacity of a No. 4 ESS office with a minimum MAS
equipment configuration of one MAS frame and two dedicated time-slot
interchange switching and permuting circuits (TSI SPCs) is 896
simultaneous call terminations per dedicated TSI SPC, or 1792 in total.
Using 90 percent occupancy on the dedicated TSI SPCs, a 7.5-s average
wait time, and a 30-s holding time per call, there could be approximately
150,000 calls per hour to one announcement or to a mix of MAS
announcements available on the two MAS-dedicated TSI SPCs. To provide
perspective, the busy hour call capacity of the No. 4 ESS is approximately
500,000. An additional 896 simultaneous call terminations can be handled
with each additional MAS-dedicated TSI SPC, of which there is a maximum
of 15. However, any one announcement can be available from a
maximum of two MAS-dedicated TSI SPCs.
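
The figure of approximately 150,000 calls per hour can be reproduced with a short calculation. The sketch below assumes that each call occupies a termination for the average wait plus the holding time; the paper does not spell out this model, and the function name is illustrative.

```python
def mas_calls_per_hour(
    spcs: int = 2,
    terminations_per_spc: int = 896,
    occupancy: float = 0.90,
    wait_s: float = 7.5,
    holding_s: float = 30.0,
) -> float:
    """Estimate hourly call throughput of the MAS-dedicated TSI SPCs.

    Assumption (not stated in the paper): a termination is held for the
    average wait before the announcement starts plus the holding time.
    """
    busy_terminations = spcs * terminations_per_spc * occupancy
    seconds_per_call = wait_s + holding_s
    return busy_terminations * 3600.0 / seconds_per_call


estimate = mas_calls_per_hour()
print(round(estimate))
```

With the default values this yields roughly 155,000 calls per hour, consistent with the approximate figure quoted in the text.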


2.2.2 Announcement capacity 


There can be from 1 to 8 MAS frames in a No. 4 ESS office. Each MAS
frame can provide a maximum of 59 30-s MAS announcements. However,
the basic building blocks of MAS announcements are 30-s sectors, and
these 59 sectors can be assigned as desired. If an announcement
lasts less than 30 s, one must nevertheless devote a 30-s
sector to that purpose. Announcements can range from 30 s to 5 min
in length. Ten 30-s sectors must be allocated to provide the necessary
resources for a 5-min announcement.

Each MAS frame has capacity for a total of eighty 30-s sectors of 
audio. A maximum of 59 of these can contain active audio, that is, 
audio which is playing back to callers. The balance of these sectors 
can be used to store standby audio, which is not yet available to callers. 
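
The sector rules above can be expressed as a small allocation check. This is a minimal sketch using only the limits quoted in the text (30-s sectors, 59 active, 80 total per frame); the function names and the rounding-up of short audio are illustrative assumptions.

```python
import math

SECTOR_S = 30            # sector length quoted in the text
MAX_ACTIVE_SECTORS = 59  # active (playing) sectors per MAS frame
TOTAL_SECTORS = 80       # active plus standby sectors per MAS frame
MAX_LENGTH_S = 300       # longest announcement (5 min)


def sectors_needed(audio_length_s: float) -> int:
    """Number of 30-s sectors an announcement occupies (short audio rounds up)."""
    if not 0 < audio_length_s <= MAX_LENGTH_S:
        raise ValueError("announcement length must be in (0, 300] seconds")
    return math.ceil(audio_length_s / SECTOR_S)


def fits_on_frame(active_lengths_s, standby_lengths_s=()) -> bool:
    """Check a proposed mix of active and standby audio against one frame."""
    active = sum(sectors_needed(s) for s in active_lengths_s)
    standby = sum(sectors_needed(s) for s in standby_lengths_s)
    return active <= MAX_ACTIVE_SECTORS and active + standby <= TOTAL_SECTORS


print(sectors_needed(300), sectors_needed(20))
print(fits_on_frame([300] * 5 + [30] * 9, standby_lengths_s=[30] * 21))
```

For example, five 5-min announcements plus nine 30-s announcements use all 59 active sectors, leaving 21 sectors for standby audio.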

The number of barge-in announcements in a No. 4 ESS can vary 
from 0 to 24. Barge-in announcements can exist even though an office 
does not have an MAS frame. 


2.2.3 Announcement characteristics 


Much flexibility exists in defining each MAS announcement in the
No. 4 ESS. The length of an MAS announcement can be from 30 to 300
s but must be a multiple of 30 s. A barge-in announcement can be from
5 s to 300 s long. Multiple plays, if specified, allow three options in
which callers can hear the audio two or three times or repeatedly (for
about 23 hours). Callers are automatically disconnected from hearing
the announcement after the specified number of plays. The charge
option, if specified, results in the No. 4 ESS returning answer supervision,
so that the caller is charged for calling the announcement.
The forced audible ringing option ensures that the caller
hears at least one cycle of audible ringing before the announcement
audio starts.

Announcement audio can be rapidly updated via any one of several
methods, most of which are external to the No. 4 ESS. A maximum of
28 MAS announcements per MAS frame can be simultaneously updated.

Announcements are available to the calling public according to start
and stop time parameters for each announcement. These parameters
can be specified so that announcement audio is activated for playback
to callers immediately and plays continuously. Alternatively, the start
and stop time parameters can specify scheduling up to 23 hr in advance.

An announcement application can have its audio in the standby 
state or in the active state or in both. However, an announcement 
application can have at most two audio copies or versions existing on 
the MAS disks—one in the active state and one in the standby state. 
Audio in the active state is playing back to callers. Audio in the 
standby state is scheduled to go active and thereby to replace the 
active copy, if it exists, at some specified start time. 

Mass Announcement System announcements are synchronous and
start from the beginning every 15 s. Thus, callers normally wait from
0 to 15 s to be connected to the beginning of audio, with an average
waiting time of 7.5 s. This wait time may be increased for announcements
defined with forced audible ringing.
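
The average wait is half the 15-s start interval, which can be checked numerically. The sketch below assumes that callers arrive uniformly over the cycle and that forced audible ringing is not in effect; the function names are illustrative.

```python
CYCLE_S = 15.0  # announcements restart every 15 s (from the text)


def wait_until_start(arrival_s: float, cycle_s: float = CYCLE_S) -> float:
    """Seconds from a caller's arrival until the next announcement start."""
    return (-arrival_s) % cycle_s


def average_wait(cycle_s: float = CYCLE_S, samples: int = 15_000) -> float:
    """Mean wait over arrivals spread evenly (at midpoints) across one cycle."""
    step = cycle_s / samples
    total = sum(wait_until_start((i + 0.5) * step, cycle_s) for i in range(samples))
    return total / samples


print(wait_until_start(4.0))
print(average_wait())
```

A caller arriving 4 s into a cycle waits 11 s, and the mean over a full cycle converges on half the cycle length.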


2.2.4 Capabilities for counting media stimulated calls 


Counting media stimulated calls is a service that involves one or 
more MAS announcements, as well as the pegging, collection, and 
output of counts of customer calls on a dialed number basis. 

In a No. 4 ESS MAS node, there are 128 dialed number counters
which can be used for up to 16 different MSC counting applications
simultaneously. Up to 64 counts can be collected in connection with
one MSC counting program. Number patterns such as 900-234-0001
through 900-234-0064 could be counted separately, yet cause routing
to a single announcement which might simply say, “Thank you for
calling. Your opinion has been counted.” On the other hand, there
could be a separate announcement for each dialed number or any
grouping of numbers.
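
The counter limits described above can be modeled with a small bookkeeping class. This is only an illustrative sketch: the class, its methods, and the application name "poll" are hypothetical and do not describe the actual No. 4 ESS software; only the three numeric limits come from the text.

```python
from collections import Counter


class DialedNumberCounters:
    """Toy model of the per-node dialed number counters (limits from the text)."""

    MAX_COUNTERS = 128
    MAX_APPLICATIONS = 16
    MAX_PER_APPLICATION = 64

    def __init__(self):
        self._app_of = {}         # dialed number -> application name
        self._counts = Counter()  # dialed number -> calls pegged

    def assign(self, application, numbers):
        """Reserve one counter per dialed number for an application."""
        numbers = list(numbers)
        apps = set(self._app_of.values()) | {application}
        already = sum(1 for a in self._app_of.values() if a == application)
        if any(n in self._app_of for n in numbers):
            raise ValueError("dialed number already assigned")
        if len(apps) > self.MAX_APPLICATIONS:
            raise ValueError("too many counting applications")
        if already + len(numbers) > self.MAX_PER_APPLICATION:
            raise ValueError("too many counters for one application")
        if len(self._app_of) + len(numbers) > self.MAX_COUNTERS:
            raise ValueError("no free counters in this node")
        for n in numbers:
            self._app_of[n] = application

    def peg(self, dialed_number):
        """Count one call; numbers without a counter are ignored."""
        if dialed_number in self._app_of:
            self._counts[dialed_number] += 1

    def totals(self, application):
        return {n: self._counts[n] for n, a in self._app_of.items() if a == application}


node = DialedNumberCounters()
node.assign("poll", [f"900-234-{i:04d}" for i in range(1, 65)])
for number in ["900-234-0001", "900-234-0001", "900-234-0002"]:
    node.peg(number)
print(sum(node.totals("poll").values()))
```

Here all 64 numbers of the example pattern are counted separately, regardless of whether they route to one announcement or to several.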

For one MAS node, the MSC counting totals can be transmitted to a
sponsor on a minute-by-minute basis via a dial-up connection which
consists of a data terminal with a mating unit on the other end of a
DDD connection. Alternatively, if multiple No. 4 ESSs are involved in
the same MSC counting application, each No. 4 ESS reports results to
a designated master No. 4 ESS, which can transmit the results to the
service sponsor. Each such No. 4 ESS, which counts customer calls and
sends these results to a master, is referred to as a slave. A given No. 4
ESS MAS office can perform slave, master, or both functions. A master
function can also exist in a No. 4 ESS without MAS, provided that
No. 4 ESS has the 4E5 (or later) generic.

Fig. 3 - Counting of media stimulated calls.


A master can have any number of slaves reporting to it. Figure 3
shows an MSC counting application involving callers in three MAS
nodes. Slaves update master counts every 5 min via Common Channel
Interoffice Signaling (CCIS) direct-signaling messages. In each No. 4
ESS office, there are 128 master counters which can be used for up to
64 different MSC counting applications simultaneously. A maximum of
64 master counters can be used for one master application.
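
The master function amounts to accumulating periodic reports from slaves. The sketch below is hypothetical: it assumes each report carries the increments since the previous report (the paper does not say whether slaves send increments or running totals), and the class and counter names are invented for illustration.

```python
class MasterTally:
    """Toy accumulator for one master application (at most 64 counters)."""

    MAX_COUNTERS_PER_APPLICATION = 64  # limit quoted in the text

    def __init__(self, counter_names):
        names = list(counter_names)
        if len(names) > self.MAX_COUNTERS_PER_APPLICATION:
            raise ValueError("one master application may use at most 64 counters")
        self.totals = {name: 0 for name in names}

    def apply_report(self, increments):
        """Add one slave report (assumed to hold increments) to the totals."""
        unknown = set(increments) - set(self.totals)
        if unknown:
            raise KeyError(f"unknown counters: {sorted(unknown)}")
        for name, n in increments.items():
            self.totals[name] += n
        return dict(self.totals)


master = MasterTally(["yes", "no"])
master.apply_report({"yes": 120, "no": 80})  # report from one slave
master.apply_report({"yes": 40})             # report from another slave
print(master.totals)
```

The accumulated totals are what a master would pass on to the sponsor over the dial-up connection.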

A No. 4 ESS office can simultaneously support a maximum of five
dial-up connections to sponsors who desire to receive minute-by-minute
MSC counting results as counts are being tabulated. This number
may be less if the number of dial-up ports engineered for the office is less.
However, any number of master applications not exceeding the 64
limit can transmit to the same sponsor simultaneously over a single
dial-up connection.

An MSC counting application can be scheduled according to specified
start and stop times, or it can run continuously, or it can cycle on and
off on a daily basis. MSC count scheduling is independent of the
scheduling of the MAS announcement(s) associated with an MSC counting
application.

Flexibility exists to set up a national MSC counting application in
which calls are counted at the same hour relative to each time zone. 
For example, callers may be stimulated to call from 7 to 8 p.m. in their 
own time zone. The master can serve as a master and slave in its time 
zone so that calls coming from that time zone are counted only from 
7 to 8 p.m., but counts from other nodes are accumulated until stop 
time has been reached in all slave nodes for this application. 


2.2.5 Cut-through capabilities


A cut-through application can be applied to any 10-digit directory
number.* Up to 64 cut-through applications (shared with network
management gap controls) can be applied simultaneously in any one
No. 4 ESS.

In a cut-through application, one call is cut through to a prespecified 
DDD directory number roughly every N seconds as shown in Fig. 4. N 
is referred to as a call-gapping interval. The call-gapping interval for 
a given cut-through service can be selected from 16 different values 
ranging from 0 to 360 s. 
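
Call gapping of this kind can be sketched as a simple timer: a call is cut through only if at least N seconds have passed since the last cut-through, and all other calls go to the announcement. The class below is an illustrative assumption about the mechanism, not the No. 4 ESS implementation, and the 10-s interval in the example is arbitrary.

```python
class CallGap:
    """Toy call-gapping control: at most one cut-through per interval."""

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self._next_allowed_s = 0.0

    def offer(self, now_s: float) -> bool:
        """Return True to cut the call through, False to route it to the announcement."""
        if now_s >= self._next_allowed_s:
            self._next_allowed_s = now_s + self.interval_s
            return True
        return False


gap = CallGap(interval_s=10.0)
arrivals = [0.0, 1.0, 5.0, 10.0, 12.0, 25.0]
decisions = [gap.offer(t) for t in arrivals]
print(decisions)
```

In this sketch an interval of 0 s cuts every call through, and longer intervals divert a growing share of callers to the announcement.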

Each cut-through application must have a different announcement 
associated with it. Conversely, any MAS announcement can have at 
most one cut-through application associated with it. 

As with counting of media stimulated calls, a cut-through application
can be scheduled according to specified start and stop times, or
it can run continuously, or it can cycle on and off on a daily basis.
Cut-through scheduling is independent of the scheduling of the MAS
announcement associated with the cut-through application.


2.3 Specification of external interface requirements 


Since MAS services may be local, regional, and national in scope,
coordination of such things as dialable number, announcement capacity,
announcement audio, and service schedules in all the No. 4 ESS
MAS offices must be administered by one central source. An organization
called the Operations Network Administration Center (ONAC) is
responsible for administration of all MAS services. Two main support
systems are used by ONAC personnel, as shown in Fig. 5. One of these
support systems is the MAS Support System (MSS), which is a computerized
system minimizing the manual administrative functions needed to
define, schedule, and monitor MAS services in all the MAS nodes. The
other is the Announcement Distribution System (ADS), which accepts
audio from the producer and can then simultaneously update all the
No. 4 ESS MAS nodes which are to receive this audio. Although ONAC
and both of these support systems are external to the No. 4 ESS, the
No. 4 ESS must interface with them in a compatible manner.


* Gap control, however, is a network management control described in Section 4.3.6
which can be applied on a 3-, 6-, 7-, or 10-digit basis.

In case a backup method is needed, all functions which ONAC
performs must also be capable of being performed in a No. 4 ESS office
to administer MAS services in that office. Another external audio
update method, called the direct producer update method, is also
needed as a backup in case the ADS fails or a local application
is not using ONAC.


2.4 Transmission plan 


Since the overriding purpose of an announcement service is to 
deliver an audio product of high quality, considerable effort was placed 
on means to safeguard the fidelity of the ultimate product, voice 
playback. Three noteworthy opportunities to introduce impairments 
are readily identified. First a producer may transport an announcement 
by electronic means to a control location, such as the ADs. If this is a 
DDD connection, there is exposure to noise, loss, and possible echo in 
this transaction. The call must then be fed to one or more No. 4 Ess 
MAS nodes. Although dedicated trunks make possible tighter control 
of transmission variables, noise, for example, will be an additive
impairment. Finally, when digitized recording is played back over DDD,
new hazards to fidelity must be anticipated. Since control between the
ADS and the MAS node is most practical, special effort was placed on
the parameters of this link. Here companding was applied in conjunc-
tion with the use of a pilot tone to permit positive checks on transmis-
sion level to be made during transmission. Loss of pilot tone would be
a positive warning that a gap in the announcement had been experi-
enced. At the MAS node, the pilot tone is filtered out, levels set and
feedback provided to the source when repetition of the announcement
is called for.

Fig. 5—No. 4 ESS/ONAC interfaces.

Transmission planning extended to the full network service, just as 
the signaling and switching planning did. It was necessary to evaluate 
peak power and average power effects when one popular announce- 
ment might be dominant over various facilities. The levels of ringing 
tone, busy tone, and the playback itself had to be established. Since 
the network is not homogeneous, simple universal answers were gen- 
erally not to be found. Tentative answers were established, however, 
and means of adapting, if necessary, were investigated and identified. 
Both laboratory and field experiments were conducted. Limits previ- 
ously found necessary to protect switching and signaling also found 
use in the transmission world. Thus, a total requirements package was 
constructed. Within its structure, means are established to allow large 
numbers of callers to have common access to versatile and customized 
telephone announcements. 


III. NO. 4 ESS SYSTEM ARCHITECTURE FOR MAS
3.1 Mass Announcement System hardware complex 


The minimum physical equipment configuration for a No. 4 Ess 
with MAS is illustrated in Fig. 6 and consists of the following compo- 
nents: 

• An MAS frame and two moving head disk units record, store, and
play back digitized voice announcements. (The maximum number of
MAS frames is eight.)

• A Peripheral Unit Control (PUC) frame provides common opera-
tional and maintenance interfaces between the 1A Processor and the
MAS frame. (One PUC frame can handle a maximum of two MAS frames.)

• Two DS-120 links (two coaxial cable pairs) connect an MAS frame
to the No. 4 Ess network. Record, monitor, and playback channels are
identified by their time-slot appearances on these links.

• Two dedicated TSI SPCs provide fanout of the announcement
phases coming from the MAS frame. Any barge-in announcements and
audible ringing are likewise fanned out by these dedicated TSI SPCs.
Incoming MAS calls are terminated on these dedicated TSI SPCs. (The
maximum number of dedicated TSI SPCs for MAS is 15.)

• Two auxiliary audible ringing trunks from the ringing and tone
plant provide the ringing that callers hear before start of audio. (One
of these is needed per dedicated TSI SPC.)

• Connections from up to 24 barge-in announcement trunks may
also optionally be provided as shown in Fig. 6.

• One or more automatic dial-up 1200-baud asynchronous chan-
nels (not shown in Fig. 6) are required on the Input/Output Processor
(IOP) frame. These dial-up channels are required for transmitting MSC
counting results to a remote data terminal. (The maximum number of
dial-up ports in a No. 4 Ess is six, but a maximum of five of these can
be used for MSC count reporting to sponsors.)

• An RCRRT2 (remote recent change) channel (shown in Fig. 5) on
the IOP frame is remoted to ONAC via a dedicated data link and allows
ONAC personnel to enter and receive messages from a No. 4 Ess to
define, control, and monitor MAS services.

• A dedicated trunk subgroup exists (shown in Fig. 5) between
ONAC and the No. 4 Ess for recording announcements from ADS. This
trunk subgroup must be uniquely identified with a special name in No.
4 ESS.


3.2 Utilization of dedicated TSI SPCs 


The concept of a dedicated TSI SPC was introduced in the initial
generic of the No. 4 Ess for the fanout of office announcements and
tones. A dedicated TSI SPC is physically the same as a regular TSI SPC,
except that its transmit and receive ports are looped with coaxial
cables. A dedicated TSI SPC has dynamic fanout capability which
enables all the callers connected to it to hear one announcement or
any mix of announcements available on it according to current demand.
This fanout capability of dedicated TSI SPCs avoids having to engineer
terminations separately for each announcement service.

The MAS frame continually plays back active announcement phases
into the network via the playback channels. Playback channels are
“nailed up” from serving TSI SPCs to dedicated TSI SPCs as shown in
Fig. 6. For reliability reasons, the two MAS submembers or units from
the same MAS frame must be connected to different TSI frames. For
further reliability reasons, the serving and dedicated TSI SPCs for each
MAS submember or unit should be from the same TSI frame. Each
dedicated TSI SPC fans out one audible ringing signal and a maximum
of 104 announcement phases.


3.3 Mass Announcement System customer call strategy 


Providing MAS announcements is a unique kind of function for a toll 
switching office. A large volume of customer calls is being terminated 
in, instead of being switched through, the No. 4 Ess. This large volume
of callers is being terminated on MAS-dedicated TSI SPCs within No. 4
ESS offices. Since any MAS announcement is available from two dedi-
cated TSI SPCs, customer calls to that announcement are connected to
the dedicated TSI SPC on which the next phase of the announcement
occurs so that customer delay time is minimized. Customers hear
audible ringing and announcement audio from the same dedicated TSI
SPC.

Fig. 6—Mass announcement system configuration.

An announcement start point in the MAS hardware is randomly 
selected at the time audio is first recorded. Thus, different announce- 
ments should phase at different times. This design was intended to 
spread simultaneous traffic load and the distribution of answer signals. 


3.4 Announcement update strategy 


Before any MAS announcement can be recorded initially, service 
provisioning data must have been input into the No. 4 Ess to define 
the announcement. 

The No. 4 Ess MAS software is designed to accommodate two basic 
methods of recording MAS announcements from sources outside the 
No. 4 Ess MAS office. Both methods are software controlled and require 
no intervention by No. 4 Ess personnel during the recording process. 
These two methods are designated the ONAC/ADS update method and 
the direct producer update method. A manual update method also 
exists for recording MAS announcements from within the No. 4 Ess. A 
report is output to ONAC after each announcement is recorded in the 
No. 4 Ess, regardless of the method. 


3.4.1 Operations Network Administration Center/Announcement 
Distribution System update method 


The ONAC/ADS typically accepts an announcement directly from the 
producer and records multiple copies of it for distribution to the 
appropriate No. 4 Ess MAS offices. Recording calls are then automati- 
cally placed via dedicated transmission facilities to these No. 4 ESS 
MAS offices to transmit the announcement audio. The ONAC/ADS 
compresses the audio and superimposes a pilot tone to ensure the 
quality of the announcement audio during the recording process. After 
recording completes, a report is sent to ONAC to indicate a satisfactory 
update. 


3.4.2 Direct producer update method 


The direct producer update method provides a means for a producer 
to directly update MAS announcements to a No. 4 Ess MAS office. A 
producer can call a predesignated telephone number and record or 
update an MAS announcement. After successful completion of the 
recording process, the No. 4 ESS MAS software originates a callback to
the predesignated producer telephone number. When the producer
answers the callback, the No. 4 Ess MAS office plays back the MAS
announcement. A producer may accept the audio by listening to the 
entire callback sequence or the producer may reject the recorded 
announcement by hanging up any time during the playback. 


3.4.3 Manual update method 


Personnel within the No. 4 Ess MAS office can make an on-site
recording from a 51A or 538A test position [in the Trunk Operations
Center (TOC)]. An input message must first specify the announcement
identity and several other announcement parameters. There is no 
automatic callback for update verification after the manual update 
method completes. The announcement is automatically marked veri- 
fied. 


IV. MASS ANNOUNCEMENT SYSTEM SOFTWARE ORGANIZATION 


The MAS software package developed in the 4E5 generic of No. 4 
ESS closely interfaces with the new MAS-related hardware to control 
PAS, counting of media stimulated calls, and cut through. This software 
package is organized and incorporated into most functional areas of 
the No. 4 Ess. This section describes the design objectives, character-
istics, and main capabilities provided by each of the MAS software
functional areas.


4.1 Mass Announcement System software design objectives 


The overall design objectives for the MAS software were as follows: 
(i) Compatibility with the existing No. 4 Ess system and environ-
ment was necessary. The development of all these new capabilities
involved integrating a large and complex software package into an
already large and complex software system where resources are becom-
ing scarce.

(ii) Hierarchical, modular, and structured programming design was
advocated. This has benefited understandability, development, and
maintainability.

(iii) Reliability was paramount. Defensive checks abound to ensure
integrity of the services being offered. 

(iv) Audio preservation for an indefinite period of time even 
through office phases and other unusual system disturbances was 
required. 


4.2 Mass Announcement System software characteristics 


Mass Announcement System is the largest feature provided since 
the initial No. 4 Ess development. It comprises approximately
100,000 words of operational software and approximately 50,000


1062 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1981 


words of maintenance software. Several new data structures have also 
been created. All of this software is closely tied together and is closely 
interfaced with the MAS hardware, thus forming a complex but unified
system. 

The MAS software is basically organized into functional areas. It 
relies heavily on subroutines and is generally separated from other 
software. For example, new call processing code for new MAS related 
types of calls is packaged as new call processing programs. Existing 
call processing code, such as final handling treatment of unsuccessful 
calls, is in modified existing programs. 

Mass Announcement System software has to handle various inter- 
esting cases. Most MAS functions take many time segments to complete 
but occur at infrequent intervals, at least relative to the number of 
calls that the No. 4 Ess handles. For example, an update call stimulates 
many bursts of MAS functions which could total several minutes 
duration, but there should not be a large number of such simultaneous 
update calls. Also, interfaces with the MAS hardware involve delays 
from the time an order is sent until the time MAS completes the 
function. In general, MAS functions are deferred when the No. 4 Ess 
system experiences overload. 

Announcement data is distributed over several data bases. Trans- 
lation data contain permanent service order information. Call store 
structures contain current status information. File store contains a 
backup of nontransient current announcement status information. 


4.3 Mass Announcement System software functional areas 
4.3.1 Announcement Handling 


Announcement Handling plays a dominant role in providing MAS 
services. It is a new software functional area that controls MAS an- 
nouncements and barge-in announcements from the time they are first 
defined until they are deleted from the No. 4 Ess. It is the primary 
operational interface with the MAS hardware (sending almost all the 
operational orders). Announcement Handling also interfaces with al-
most all other functional areas involved with the MAS feature.

The main functions of Announcement Handling include the follow- 
ing: 

(i) administrative processing during recording updates and veri-
fication callbacks,
(ii) duplication processing for MAS announcements,
(iii) scheduling of PAS, MSC counting, and cut-through services,
(iv) providing call processing with announcement phasing infor-
mation,
(v) maintaining MAS hardware status as it applies to MAS an-
nouncements,
(vi) providing manual support capabilities,
(vii) providing file store backup for nontransient announcement
data.

To control and administer MAS announcements, Announcement 
Handling maintains a central source of current status and bookkeeping 
information for every MAS announcement in a No. 4 Ess. The primary 
data structures, which are all new to the No. 4 Ess, include the 
following: 

(i) The MAS Announcement Status Table (mMsTAT) is a per an-
nouncement call store data table. Each entry contains current an-
nouncement data (such as announcement state, duplication status,
sector identities, and manual control information), as well as transla-
tions-like information which is recent changeable (such as announce-
ment start and stop times and cut-through directory number).

(ii) The MAS Announcement Phasing Table (MAPT) is another per
announcement call store data table. Each entry contains active an-
nouncement information used by Call Processing to handle MAS cus-
tomer calls quickly.

(iii) The MAS sector busy/idle map is a call store structure contain-
ing current usage status of each of the 80 sectors available for an-
nouncement audio storage on each MAS frame in the office.

(iv) The MAS Announcement Register (MAR) is a four-word call
store data table seized when needed from a pool of MARs. Announce- 
ment Handling functions, such as duplication, callback, and MAS order 
sending, use these registers in two-way linked lists to queue these 
internal processes (since they may take minutes of time to complete). 
This enables easy processing of these per announcement functions on 
a first-in, first-out basis. 
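
The first-in, first-out queueing of per announcement functions on registers drawn from a common pool can be sketched as follows. This is only an illustration in Python, not the 1A Processor implementation: the class names `Mar` and `MarQueue`, the field names, and the pool size are assumptions. Only the ideas of a shared pool of registers and a two-way linked list served in arrival order come from the text.

```python
class Mar:
    """Hypothetical register holding one queued per announcement request."""

    def __init__(self):
        self.announcement_id = -1
        self.prev = None
        self.next = None


class MarQueue:
    """Two-way linked FIFO list of registers seized from a shared pool."""

    def __init__(self, pool):
        self.pool = pool      # list of idle registers, shared between queues
        self.head = None      # oldest queued request
        self.tail = None      # newest queued request

    def enqueue(self, announcement_id):
        if not self.pool:
            return False      # no idle register is available
        mar = self.pool.pop()
        mar.announcement_id = announcement_id
        mar.prev, mar.next = self.tail, None
        if self.tail is None:
            self.head = mar
        else:
            self.tail.next = mar
        self.tail = mar
        return True

    def dequeue(self):
        mar = self.head
        if mar is None:
            return None
        self.head = mar.next
        if self.head is None:
            self.tail = None
        else:
            self.head.prev = None
        announcement_id = mar.announcement_id
        mar.announcement_id, mar.prev, mar.next = -1, None, None
        self.pool.append(mar)  # release the register back to the pool
        return announcement_id


pool = [Mar() for _ in range(4)]        # pool size chosen arbitrarily
duplication_queue = MarQueue(pool)      # one queue per internal process
callback_queue = MarQueue(pool)
```

Because both queues seize registers from the same pool, a long backlog in one process reduces the registers available to the other, which is one reason a request may have to be refused and retried.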

In performing its functions, Announcement Handling controls the 
announcement audio cycle by processing an announcement through 
its various announcement states. These Announcement Handling func- 
tions are described in more detail in the following sections. 


4.3.1.1 Recording interfaces. Checks are made to see if a recording 
call can be accepted. If so, sectors may need to be allocated in the MAS 
hardware before recording can begin. Announcement Handling selects 
which submember the announcement should be recorded on. If trans- 
mission check failure reports are received from the MAS hardware 
during an ONAC/ADS recording, Announcement Handling informs Call 
Processing as to whether or not the recording call should continue. 
For a given announcement, the first three such call attempts with 
transmission problems are aborted. The fourth such attempt is re- 
corded in spite of transmission problems, but a report is issued so that 
manual actions can subsequently be taken to listen to the audio and 
either accept it or remove it. At completion of recording, several
Announcement Handling processes are started such as announcement
duplication and verification callback for direct producer updates. 
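
The treatment of transmission check failures during an ONAC/ADS recording can be summarized in a short sketch. The function name, its argument, and the returned pair are hypothetical; the text states only that the first three failing attempts are aborted and that the fourth is recorded with a report for manual follow-up.

```python
MAX_ABORTED_ATTEMPTS = 3  # from the text: first three failing attempts are aborted


def handle_transmission_failure(failed_attempts_so_far):
    """Decide how to treat a transmission check failure report.

    failed_attempts_so_far counts earlier attempts for this announcement
    that also had transmission problems (hypothetical bookkeeping).
    Returns (continue_recording, needs_manual_review).
    """
    if failed_attempts_so_far < MAX_ABORTED_ATTEMPTS:
        return (False, False)  # abort this recording call
    return (True, True)        # fourth attempt: record anyway, flag the audio
```

Whether an aborted attempt also produces a report is not stated in the text, so the sketch flags only the fourth attempt.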


4.3.1.2 Direct producer callback interfaces. Direct producer callback 
requests are queued and, when appropriate, Announcement Handling 
requests Call Processing to initiate an outgoing call to the direct 
producer. There is an initial delay of about 10 s to allow the direct
producer to hang up after recording completes. Thereafter, reattempts 
are initiated once a minute until the direct producer answers, for a 
total of four such attempts. For security reasons, a direct producer 
recording cannot be activated until verification completes. 
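
The callback timing can be illustrated with a small Python sketch. It assumes that "a total of four such attempts" means four attempts including the first; the text could also be read as four reattempts. The function and constant names are hypothetical.

```python
INITIAL_DELAY_S = 10        # "about 10 s" in the text
REATTEMPT_INTERVAL_S = 60   # "once a minute"
TOTAL_ATTEMPTS = 4          # assumption: four attempts including the first


def callback_attempt_times(recording_complete_s=0):
    """Times (in seconds) at which callback attempts would be initiated."""
    first = recording_complete_s + INITIAL_DELAY_S
    return [first + n * REATTEMPT_INTERVAL_S for n in range(TOTAL_ATTEMPTS)]
```

Under these assumptions, an unanswered callback is abandoned a little over three minutes after recording completes, and the announcement remains unverified until manual action is taken.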


4.3.1.3 Announcement duplication. When an MAS frame is duplex,
announcement audio is recorded onto one MAS submember. Announce- 
ment Handling must send orders to the MAS hardware so that an- 
nouncement audio can be duplicated within the mate submember. 
Several orders must be sent for each sector involved in a prescribed 
time order. Except for maintenance update, the MAS hardware can
operationally duplicate only one sector at a time from each of the MAS 
submembers. An announcement duplication request is queued in An- 
nouncement Handling as soon as recording completes. The time 
needed to duplicate an announcement (not including time on the 
queue) is equal to the defined length of the announcement plus 15 s. 
Activation can precede duplication if start time occurs before dupli- 
cation is initiated. 
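
As a worked illustration of the duplication time, the following Python sketch computes when each queued duplication would finish. The strictly serial handling of queued announcements is a simplifying assumption; the text states only the per announcement time (defined length plus 15 s) and the one-sector-at-a-time limit. The function names are hypothetical.

```python
DUPLICATION_OVERHEAD_S = 15  # from the text: defined length plus 15 s


def duplication_time(defined_length_s):
    """Time to duplicate one announcement, excluding time on the queue."""
    return defined_length_s + DUPLICATION_OVERHEAD_S


def queue_completion_times(lengths_s):
    """Completion time of each queued request, assuming serial processing."""
    times, clock = [], 0
    for length in lengths_s:
        clock += duplication_time(length)
        times.append(clock)
    return times
```

For example, a 30-s announcement followed by a 60-s announcement would finish duplication at 45 s and 120 s, respectively, under this assumption.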


4.3.1.4 Scheduling of PAS announcements, MSC counting, and cut 
through. Announcements can be recorded and scheduled to start im-
mediately or up to 23 hr in the future. For external recordings, start
time must be previously entered via a recent change. (This recent 
change could specify that the announcement should start immediately. 
Start times are normally in terms of hours and minutes.) Manual 
recordings are initiated with an input message which specifies start 
time. When a recording begins, a start date is determined based on the 
current value of start time for that announcement. If the start time is 
less than an hour past the present time, it is assumed that the 
announcement should start immediately. When start time occurs, 
Announcement Handling begins the announcement activation process 
by sending orders to the MAS hardware. When this is accomplished, 
the announcement state changes from standby to active. This is a 
gradual process in the MAS hardware and is not complete until all 
phases of the announcement are playing back. 
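
One reading of the start date rule is sketched below in Python. It assumes that start time is a clock time in hours and minutes, that a start time which passed less than an hour ago means "start immediately," and that any other start time refers to its next occurrence, which is then at most 23 hr away. The function name `resolve_start` is hypothetical.

```python
from datetime import datetime, timedelta


def resolve_start(now, start_hour, start_minute):
    """Return the date and time at which activation should begin."""
    candidate = now.replace(hour=start_hour, minute=start_minute,
                            second=0, microsecond=0)
    if candidate > now:
        return candidate                   # later today
    if now - candidate < timedelta(hours=1):
        return now                         # just passed: start immediately
    return candidate + timedelta(days=1)   # same clock time tomorrow
```

With a recording that begins at 14:30, a start time of 14:00 activates immediately, 15:00 activates the same day, and 13:30 activates at 13:30 the next day, exactly 23 hr later.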

If there was a previously active version of the announcement when 
start time for the standby version occurs, an active/standby switch 
takes place. The standby version goes into activation and the previ-
ously active version goes into deactivation.

Announcement stop time scheduling is similar to start time sched-
uling. A stop date is determined when activation completes based on
the current value of stop time for that announcement. When stop time 
occurs, Announcement Handling begins the announcement deactiva- 
tion process by sending orders to the MAS hardware. Deactivation is 
also a gradual process. When deactivation completes, the audio for this 
version no longer exists. 

Announcement Handling also schedules MSC counting and cut-
through applications. Whenever start or stop times occur, the appro-
priate functional area program is invoked. 


4.3.1.5 Announcement phase processing. Active announcements are 
phased in the MAS hardware so that the beginning of an announcement
starts every 15 s. The total number of phases an announcement has is 
directly proportional to its length. The number of phases equals 
defined length (which must be a multiple of 30 s) divided by 15. The 
MAS submembers autonomously issue a playback phasing report when 
the beginning of a phase occurs. Phases occur on alternate units for a 
duplex system. Announcement Handling processes these phasing re- 
ports by marking a call store table with all the information Call 
Processing needs in order to determine quickly where to connect 
callers at any given time (which dedicated TSI SPC, port, channel, time
of the phasing report, etc.). 
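
The phase arithmetic can be illustrated with a short Python sketch. The function names and the unit labels are hypothetical; the 15-s phase interval, the multiple-of-30 rule, and the alternation of phases between units of a duplex system are taken from the text.

```python
PHASE_INTERVAL_S = 15  # the beginning of an announcement starts every 15 s


def phase_count(defined_length_s):
    """Number of phases for an announcement of the given defined length."""
    if defined_length_s <= 0 or defined_length_s % 30 != 0:
        raise ValueError("defined length must be a positive multiple of 30 s")
    return defined_length_s // PHASE_INTERVAL_S


def phase_units(defined_length_s, units=("unit 0", "unit 1")):
    """Assign successive phases to alternate units of a duplex frame."""
    return [units[i % len(units)]
            for i in range(phase_count(defined_length_s))]
```

In this sketch a 60-s announcement has four phases, two on each unit. Because the defined length is a multiple of 30 s, the phase count is always even, so the two units carry equal numbers of phases.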


4.3.1.6 Mass Announcement System hardware status change pro- 
cessing. Announcement Handling updates current announcement sta-
tus to reflect the current state of the MAS hardware. The MAS hardware
status changes and the corresponding actions which Announcement 
Handling takes are as follows: 

(i) Restoral with audio lost. The MAS hardware initially restores
the first MAS submember with audio lost. This same type of status
change also occurs after a duplex MAS failure due to a fault in the MAS
hardware. At this restoral time, Announcement Handling issues re- 
ports for any audio lost. Announcement Handling sends orders to the 
MAS hardware to redefine all announcements defined in the No. 4 Ess 
translations data base. When order sending completes, the MAS hard- 
ware is ready to accept recordings. All audio needs to be rerecorded. 

(ii) Restoral with audio saved. This type of a restoral of a MAS
submember takes place after a duplex failure of MAS which was caused
by a fault in a connecting unit, such as a TSI or PUC. Announcement
Handling determines at this time whether any audio was lost due to 
the duplex failure. Any audio which was previously simplex on the MAS 
submember still out-of-service is lost and an audio lost report is issued. 
Otherwise, all normal announcement activities on the restored sub-
member resume. This announcement audio has survived the duplex
failure condition. Announcement phasing reports resume from the
simplex MAS submember. Thus, MAS customer calls can resume. Re- 
cording and other actions can also now take place on this simplex MAS 
submember. 

(iii) Restoral to duplex. The second MAS submember to be restored
is brought into service via a process called Maintenance Update. In 
this process all audio and announcement definitions and assignments 
from the in-service MAS submember are copied to the out-of-service 
MAS submember. This process takes approximately 2 min. During this 
time all Announcement Handling activity (except the standby monitor 
function) is locked out. After this Maintenance Update process com- 
pletes, all audio is duplex and normal announcement activities can 
resume on both MAS submembers. 

(iv) Removal causing simplex outage. Two basic interfaces exist 
with Peripheral Maintenance. (i) A conditional removal is one result- 
ing from a diagnostic or a manual request whereby the simplex removal 
is delayed until all announcement activity on this MAS submember 
completes so that audio will not be lost. Announcement Handling 
ensures that all recordings in progress on this MAS submember com- 
plete, that customer calls hear at least one play of the longest active 
announcement on this MAS submember, and that all simplex audio on 
this MAS submember is duplicated within the other MAS submember. 
(ii) For a forced removal, all activity on this MAS submember is
aborted. This includes recording calls and direct producer callbacks. 
All simplex audio on this MAS submember is lost and corresponding 
audio lost reports are issued. 

(v) Removal causing duplex outage. (Both MAS submembers fail 
simultaneously or the second MAS submember fails.) All announcement 
activities in progress at the time of the failure are aborted. No processes 
involving the MAS hardware can take place. Announcement status is 
left unchanged since audio loss is determined at restoral time. 
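
The audio loss determination made at a restoral with audio saved, item (ii) above, can be sketched in Python. The data layout (a mapping from announcement identity to the set of submembers holding its audio) and the function name are assumptions made for illustration only.

```python
def audio_lost_after_restoral(holders_by_announcement, out_of_service):
    """List announcements whose only copy is on the out-of-service submember.

    holders_by_announcement maps an announcement identity to the set of
    submembers that held its audio before the duplex failure.
    """
    lost = []
    for announcement_id, holders in holders_by_announcement.items():
        if holders == {out_of_service}:
            lost.append(announcement_id)   # was simplex on the failed side
    return sorted(lost)
```

Audio that was duplex, or simplex on the restored submember, survives; only audio that was simplex on the submember still out of service would generate an audio lost report.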


4.3.1.7 Manual support capabilities. Manual intervention is not re- 
quired to control MAS services except for defining the services, moni- 
toring their current status, and resolving problem situations. Auto- 
matic reports are output regarding announcement situations which 
administrative personnel must be aware of. (The destination of these 
per announcement reports is assigned at the time an announcement is 
defined.) Manual announcement override capabilities are provided 
which are initiated via input messages. These capabilities can be 
exercised by either ONAC personnel or personnel in the No. 4 Ess. A 
list of these per announcement capabilities is as follows: 
(i) Manual update can be done from a 51A or 538A test position.
(ii) Standby audio can be listened to from a 51A or 538A test
position or from a dedicated trunk to ONAC.
(iii) Announcement audio, either standby or active, can be re-
moved.

(iv) External updates can be inhibited (and then subsequently be
allowed).

(v) Activation can be inhibited even though start time arrives.

(vi) Update received reports can be inhibited.

(vii) A standby announcement can be manually marked verified
(for cases where a direct producer callback failure has occurred and
the audio has been manually verified).

(viii) A standby announcement which has been recorded using the
ONAC/ADS method in which the recording was accepted on the fourth 
attempt in spite of transmission check failures can have the transmis- 
sion problem indication removed from its current status so that acti- 
vation scheduling can take place. 

(ix) Announcement status can be obtained. 

(x) The number of busy disk sectors per MAS frame can be 
obtained. 


4.3.1.8 File store backup of nontransient data. A Machine Updatable 
Data System (MUDS) exists to back up nontransient MAS announce-
ment data on disk. Every time a significant event happens to an
announcement, a disk write request is made. Thus, integrity of impor- 
tant, translations-like information, such as cut-through numbers, is 
provided. This backup information is retrieved after phases in which 
call store has been cleared, after Audits have detected call store 
mutilation, and after recent change rollback situations. 

MUDS also provides a lockout system so that only one
process can be working on a given announcement at a time.
This prevents interfering situations.
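
A per announcement lockout of this kind can be sketched as follows. This Python illustration is not the MUDS design; the class name, method names, and process labels are hypothetical, and the sketch shows only the rule that one process at a time may work on a given announcement.

```python
class AnnouncementLockout:
    """Allow at most one process to work on a given announcement."""

    def __init__(self):
        self._owners = {}   # announcement identity -> owning process

    def acquire(self, announcement_id, process):
        if announcement_id in self._owners:
            return False    # another process already holds this announcement
        self._owners[announcement_id] = process
        return True

    def release(self, announcement_id, process):
        if self._owners.get(announcement_id) == process:
            del self._owners[announcement_id]
            return True
        return False        # only the owning process may release
```

A process that is refused the lockout would presumably defer its work and try again later; the text does not say how such conflicts are resolved.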


4.3.2 Call Processing 


Call Processing controls the new types of MAS calls, which are as
follows: 
(i) There are three types of audio recording calls for the various
update methods available:
• the ONAC/ADS update method,
• the direct producer update method, and
• the manual update method using a 51A or 53A test position. (Control
is shared with trunk maintenance.)

(ii) Callback for direct producer recording.

(iii) Mass Announcement System customer call.

The traditional call register is the basic data structure used for all 
these MAS calls. Calls involving connections to a dedicated TSI use new
data structures called Dedicated TSI Connection Registers (DTCRs).

Another function which Call Processing provides is PUC report
dispensing. These reports originate from the MAS frame and indicate
significant announcement events which either Call Processing or An-
nouncement Handling must act on. Examples of these reports are
recording transmission check failures, recording complete, and an- 
nouncement phasing reports. 


4.3.2.1 Call flow for ONAC/ADS update. Figure 7 shows a simplified 
event flow for the ONAC/ADS recording method. A more detailed system
description for a typical ONAC/ADS update call is as follows: 

(i) The ADS dials a directory number over the dedicated trunk to
the No. 4 Ess. This directory number is different for each announce-
ment. At the particular No. 4 Ess, the received directory number is
recognized as a request to update a particular MAS announcement. The
special identity of the incoming trunk distinguishes this recording call
as an ONAC/ADS update call.

Fig. 7—Simplified event flow for ONAC/ADS recording method.

(ii) ADS starts sending a pilot tone after completing digit transmis-
sion. The No. 4 Ess sends an answer signal and hunts for a record port 
on the appropriate MAS submember. The MAS submember is instructed 
to begin recording (with transmission checks) after the connection to 
the record port is established. 

(iii) The MAS submember autonomously performs the recording
function. This includes measuring the pilot tone gain and filtering out 
the pilot tone, doing an initial noise check, sending the start record 
tone which alerts ADS to start transmitting announcement audio,
expanding the audio material as it is recorded on the disk, monitoring 
the pilot tone for fading during audio transmission, performing a final 
noise check when the end of the last allocated disk sector is reached, 
and generating a recording complete report. 

(iv) If a transmission check problem is detected at any time during 
the recording, the MAS submember issues a report. The recording call 
will be aborted unless this is the fourth consecutive update call attempt 
encountering a transmission problem for a particular announcement. 

(v) When the recording completes, the record port is released and 
the call is disconnected by No. 4 Ess. Announcement audio for the 
given announcement now exists in the standby state. Announcement 
duplication and start time scheduling are automatically initiated by 
the No. 4 Ess. 


4.3.2.2 Call flow for direct producer update. Figure 8 shows a simpli-
fied event flow for a direct producer update and subsequent callback. A
more detailed system description for a typical direct producer update 
call is as follows: 

(i) The producer dials a 7- or 10-digit recording update directory
number, which is different for each announcement, and is routed to 
the No. 4 Ess. At the No. 4 Ess, the received directory number is 
recognized as a request to update a particular MAS announcement. 

(ii) The call is then connected to a minimum of one cycle of audible
ringing and the appropriate MAS submember is selected for the call. At 
the end of the ringing period there is approximately one-fourth s of 
silence. At the end of this silent period, answer supervision on non- 
CAMA (Centralized Automatic Message Accounting) trunks is returned 
and on CAMA trunks billing is initiated. Call progress tone is connected 
to the call for a minimum of one-half s while a record port is hunted 
and reserved. 

(iii) The MAS submember is instructed to begin recording (without
transmission checks). Call progress tone continues until the MAS sub-
member is ready to start recording. The record port is then connected
and the producer hears approximately three-fourths s of record tone
and can then begin input of audio material.

Fig. 8—Simplified event flow for direct producer update and callback.
(Events shown, in order. Direct producer recording method: dialed digits
translate to direct producer recording request for particular announcement;
audible ringing; 4E sends answer; call progress tone; record tone from MAS;
audio recording on MAS; call progress tone; 4E disconnects; 10-second wait.
Direct producer callback: 4E originates callback to direct producer; direct
producer answers; call progress tone; standby audio played back to direct
producer; call progress tone; 4E disconnects; output message “UPDATE RECVD”;
announcement activated immediately unless start time previously changed.)

(iv) The producer must maintain the connection for the allotted
announcement length, plus an additional 10 s. When the MAS submem- 
ber reports that recording is complete (i.e., has reached the end of the 
last disk sector allocated for recording), the record port is released and
the call is connected to a final 10 s of progress tone, after which the
call is disconnected. If the producer does not disconnect first (i.e., 
before the end of the call progress tone), the recording is successfully 
completed and announcement audio for the given announcement now 
exists in the standby state. Announcement duplication and scheduling 
of the direct producer callback are automatically initiated by No. 4 
ESS. 


4.3.2.3 Call flow for Manual Update. Figure 9 shows a simplified event 
flow for a manual recording. A more detailed system description for a 
typical manual update call is as follows: 

(i) Within the No. 4 ESS an input message is entered to request a
manual update for a particular announcement. The input message also
specifies a start time, actual audio length (so that subsequent
customer calls to hear the announcement can be removed from the
announcement termination promptly), whether or not transmission
checks are to be performed, and the identity of the 51A or 53A test
position trunk to be used.

(ii) The person making the recording from the test position then
hears one of two possible sets of tone sequences. If no transmission
checks are specified (which is the normal manual case), the recording
scenario follows that described for a direct producer update (after the
digit reception process). If transmission checks are specified, the
recording scenario follows the ONAC/ADS update call flow. In this case,
pilot tone and compressed audio are expected by the MAS submember.

(iii) When recording successfully completes, announcement audio
for the specified announcement exists in the standby state.
Announcement duplication and start time scheduling are automatically
initiated by the No. 4 ESS.


4.3.2.4 Call Flow for direct producer callback. Figure 8 shows a sim- 
plified event flow for a direct producer update and callback. A more 
detailed system description for a typical direct producer callback is as 
follows: 

(i) When appropriate, No. 4 ESS originates a call to the producer's
callback number and waits for answer supervision to be returned.
When steady (greater than 2 s with no switchhook transitions) answer
supervision is received, the call is connected to call progress tone. A
monitor port on the appropriate MAS submember is hunted and reserved.

(ii) The MAS submember is instructed to play back the announce- 
ment on the monitor port. At the beginning of the announcement 
playback, the call is switched from call progress tone to the monitor 
port to hear the announcement. 

(iii) After one play of the announcement, the call is connected to a
final 10 s of call progress tone and then disconnected. If the producer
does not disconnect first (i.e., before the end of the final call progress
tone), the audio update is considered to have been verified and will be
scheduled for activation as soon as possible or at the start time given
via recent changes.

Fig. 9—Simplified event flow for manual recording method.


4.3.2.5 Call Flow for MAS customer call. Figure 10 shows a simplified 
event flow for a MAS customer call. A more detailed system description 
of the call flow for a typical customer call to a MAS announcement is 
as follows: 

(i) The customer dials a 7- or 10-digit announcement service
directory number and is routed to a No. 4 ESS MAS office. The No. 4 ESS
recognizes the received number as a request to hear a particular MAS
announcement. It also recognizes whether MSC counting is associated
with the dialed number. Next, it determines whether cut-through
service is in effect and, if so, whether the call should be cut through or
connected to the announcement. When the dialed number is a MAS
vacant code, or the requested announcement service is inactive, the
call will be connected to an appropriate special MAS announcement.
The preannouncement attempt count is incremented at this point.

(ii) The No. 4 ESS determines which dedicated TSI SPC the call
should be connected to in order to minimize the time spent listening
to audible ringing. When specified, the forced ringing option is taken
into account. If all terminations are busy on the first choice dedicated
TSI SPC, the call will be connected to the second choice dedicated TSI
SPC (for MAS announcements only) or else to busy tone if no
announcement terminations are available. If the MAS equipment should
be out of service so that no MAS announcement can be given, the call is
connected to a no-circuit announcement from the office announcement
machine. At the same time that a call is connected to a PAS-dedicated
SPC, the appropriate announcement occupancy count is incremented and
the appropriate count in the geographic separations matrix is pegged.
This is further explained in Section 4.3.7.

Fig. 10—Simplified event flow for MAS customer call.

(iii) At the end of the audible ringing interval, the call is connected
to the announcement via the same dedicated TSI SPC that provided the
audible ringing. When the announcement connection is made, the
appropriate announcement completion count is incremented. Answer
supervision is then initiated (unless specified otherwise) or CAMA
billing is initiated (if the incoming trunk is CAMA). If applicable, the
MSC dialed number count is incremented after answer supervision is
returned.

(iv) The customer can abandon at any point during the announcement.
In the absence of early abandonment, the call is automatically
removed from its MAS termination when the allocated playing time
elapses (as determined from the announcement length and specified
number of plays). The announcement occupancy is measured up to
the point of abandonment or forced termination.
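
The termination choice in step (ii) and the playing-time limit in step (iv) can be sketched as follows. This is only an illustration under stated assumptions: the names `DedicatedSpc`, `connect_customer_call`, and `allocated_playing_time_s` are hypothetical, the free-termination counter stands in for whatever busy/idle data the No. 4 ESS actually keeps, and the playing time is assumed to be simply announcement length multiplied by the number of plays.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DedicatedSpc:
    """Hypothetical stand-in for a dedicated TSI SPC and its idle terminations."""
    name: str
    free_terminations: int


def connect_customer_call(first: DedicatedSpc,
                          second: Optional[DedicatedSpc],
                          mas_in_service: bool = True) -> str:
    """Return the treatment given to a customer call, following step (ii)."""
    if not mas_in_service:
        # MAS equipment out of service: office announcement machine is used.
        return "no-circuit announcement"
    for spc in (first, second):
        if spc is not None and spc.free_terminations > 0:
            spc.free_terminations -= 1  # seize one announcement termination
            return spc.name
    return "busy tone"


def allocated_playing_time_s(announcement_length_s: float, plays: int) -> float:
    """Assumed form of step (iv): length times specified number of plays."""
    return announcement_length_s * plays
```

With one idle termination on each SPC, three successive calls would receive the first choice, the second choice, and busy tone, in that order.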


4.3.3 Trunk Maintenance 


Trunk Maintenance software provides a variety of capabilities. 
Many of the manual support capabilities were developed in the Trunk 
Maintenance area to interface with Announcement Handling and will 
be used by ONAC personnel. These functions are described in the 
Announcement Handling section and in the Audits and System Integ- 
rity section. Other Trunk Maintenance capabilities developed for MAS 
are as follows: 

(i) Trunk Maintenance handles the MAS-related trunks. It removes
and restores the record and monitor trunks corresponding to removals
and restorals of the associated MAS submember. Playback trunks are
kept nailed up to the dedicated TSI SPC.

(ii) The Trunk Operations Center alerting software informs ONAC
of duplex MAS failures and restorals.

(iii) Trunk Maintenance messages can be requested to include
per-call information on ineffective attempts for update calls and MAS
customer calls. These are useful for resolving problems with
unsuccessful update calls and MAS customer calls.

(iv) Trunk Maintenance shares control with Call Processing for
manual recordings which use the 51A or 53A test position. The trunk
maintenance calls for recording and listening to standby audio use the
traditional trunk maintenance register (TMR) for per-call data.


4.3.4 Peripheral maintenance software


The MAS and PUC frames are two new peripheral units for No. 4 ESS.
The peripheral maintenance software functions provided for these new
frames are as follows:

(i) Bootstrap capability during office phases 2 and higher
configures these frames as appropriate to the escalation rate of phases
and severity of the problem(s). Duplex configuration of MAS and PUC is
preserved if possible.

(ii) Input message or Power Control Switch (PCS) functions are
manual means which request removal, diagnosis, and restoral of PUC,
MAS, and associated hardware units.

(iii) Master Control Center (MCC) panel and frame status monitor
and display whether or not the PUC, MAS, and associated units are in
service.

(iv) Diagnostics and routine exercise programs run automatically
to ensure the sanity of the MAS and PUC hardware. Trouble location
procedures and diagnostics can be requested manually in attempting
to resolve problems.

(v) Fault recovery routines are executed when faults are detected
for any of the PUC, MAS, or associated hardware units. These errors
can be reported through f-level, interject level, or base level.

Because of the unique design of PUC and MAS frames that use
microprocessors, the Peripheral Maintenance software design for these
frames reflects significant changes from those used to handle
conventional frames in past generics. For example, the new design takes
into account the intelligence built into the microprocessors in the
implementation of the configuration and recovery actions.

The Peripheral Maintenance software also has special interfaces 
with Announcement Handling to ensure that audio is not unnecessarily 
lost. For example, before one of the MAS submembers can be removed 
from service for diagnostics, sufficient delay time must take place to 
ensure that all recordings in progress on that MAS submember complete 
and are operationally duplicated to the other in-service MAS submem- 
ber and that all other announcement processes complete. 

Craft procedures for maintaining the MAS and PUC hardware are
different from those for other frames in the office. There may be delays
which must be observed when removing, restoring, or diagnosing MAS
submembers. Special care must be taken to preserve audio on the MAS
disks.


4.3.5 Recent change and verify 


Certain MAS entities, such as the basic equipment configuration,
must be defined with other office-dependent data at generic retrofit
time. This includes the assignment of the MAS playback, monitor, and
record channels, auxiliary audible ringing, barge-in playback trunks,
and PAS-dedicated TSI SPCs. The nailed-up connections between serving
TSI SPCs and dedicated TSI SPCs for MAS frame playback channels and
barge-in playback trunks are also exclusively defined at retrofit time.
Although individual announcement services can be established initially
in the No. 4 ESS, it is expected that the Recent Change system will be
the primary means by which this is done.

Recent Change is the system used by the No. 4 ESS to modify the
No. 4 ESS translations data base. New capabilities have been added to
the Recent Change system to allow the addition, modification, and
deletion of PAS announcement and cut-through applications. The MSC
counting applications can likewise be added and deleted via recent
changes. Modifications have been made to the existing recent change
system for code grouping to allow directory number translations for
MAS announcements and recording updates. Modifications have been
made to existing dial-up port recent changes to allow MSC counting
results to be output to remote data terminals. Modifications have been
made to existing trunk subgroup recent changes to allow geographic
separations data to be collected on calls to MAS announcements.

The Verify system is the means to retrieve and display the data
residing in the No. 4 ESS translations data base. New capability has
been added in conjunction with the new recent changes to retrieve
data concerning MAS announcements, MSC counting applications,
cut-through applications, and dial-up.


4.3.6 Network management 


Network Management software is a direct contributor to cut-through
service. The call-gap timer is set to the present time, plus a time
interval (gap). The first call to arrive upon expiration of the timer will
be forwarded to a prespecified DDD number. The timer is then reset to
the present time, plus the gap interval. Mass Announcement System
calls which arrive before the timer has expired are connected to a
prespecified customized announcement at the No. 4 ESS MAS office.
This process is continued until the cut-through service is deactivated.
The gap interval also includes an offset interval used to distribute calls
over time to prevent bunching of calls from several offices at once.
The offset interval also prevents favoring calls from one office.
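
The call-gap behavior described above can be sketched as a small timer object. This is a minimal illustration, not the No. 4 ESS implementation: the class name `CallGapTimer` and its parameters are hypothetical, times are plain seconds, and the offset interval is assumed to be already folded into `gap_s`.

```python
class CallGapTimer:
    """Hypothetical model of the cut-through call-gap timer."""

    def __init__(self, gap_s: float, now_s: float = 0.0) -> None:
        self.gap_s = gap_s                 # gap interval (assumed to include the offset)
        self.expires_at = now_s + gap_s    # present time plus gap

    def handle_call(self, now_s: float) -> str:
        """Classify a call arriving at time now_s."""
        if now_s >= self.expires_at:
            # First call on or after expiration is cut through; timer is reset.
            self.expires_at = now_s + self.gap_s
            return "cut-through"
        # Calls arriving before expiration hear the customized announcement.
        return "announcement"
```

With a 10-s gap started at time 0, calls arriving at 3, 10, 12, and 25 s would be treated as announcement, cut-through, announcement, and cut-through, respectively.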

Other Network Management functions which affect MAS are as 
follows: 

(i) Inhibiting reroute control of MAS calls to prevent traffic
overflow from one MAS island to another. The data found in the office
translations can be used to specify nonreroutable codes.

(ii) Gap control is a normal Network Management control available
in MAS and non-MAS offices that may be used to meter the amount of
traffic (i.e., selective choke) or to override cut-through values in MAS
offices. To provide Network Management capability so as to protect
the network during heavy calling periods or during traffic congestion
due to facility failures, an override of the service-order specified call
gap interval is made available in MAS offices. Intermediate No. 4 ESS
offices which are non-MAS offices can apply gap controls also to protect
the network. Gap control can be applied on a 3-, 6-, 7- or 10-digit basis.
Calls which are not allowed to complete are connected to an emergency
or no-circuit office announcement as specified by the gap control. Gap
control is accomplished via a new control page (CNO8) and the already
existing Network Management cathode-ray tube (CRT) display system.
The same gap intervals are available as for cut-through applications.

(iii) Reports are output to ONAC whenever gap controls are applied,
removed, or have their interval changed by the network manager.

(iv) In addition to the normal display page capability provided by
the Network Management CRT display system, the control page
provides inventory displays for MAS cut-through, MAS MSC counting, and
MAS announcements. In addition, the capability to select the previous
5, 10, or 15 min worth of data is also available. Mass calling congestion
is most likely to accompany the MAS-type of services. Therefore, these
displays of MAS data are made available to monitor this type of traffic,
determine the source of the problem, and control this traffic if
necessary.


4.3.7 Traffic and plant measurements 


All MAS-related measurements used to administer and maintain No.
4 ESS MAS offices and the interconnecting network are called traditional 
measurements. These types of measurements have traditionally pro- 
vided the data required to measure service volume and quality, detect 
weak spots in system performance, guide maintenance activities, en- 
gineer future equipment additions, and determine the division of 
revenues. Eighty-one new MAS-related measurements have been pro- 
vided and are collected on a 15-min basis and stored in the traffic and 
plant measurement data base. 

Geographical survey measurements are contained in a 32 by 32 
matrix which provides information about the place of origin of calls to 
selected MAS announcements. This information may be valuable to the 
sponsors of some announcements. For example, the information might 
be used to analyze the impact of advertising in various geographical 
areas. This geographical survey matrix contains completion peg counts 
based on incoming trunk subgroup and the announcement called. 
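
A 32 by 32 peg-count matrix of this kind can be sketched as below. The text does not say how incoming trunk subgroups and announcements are mapped onto the 32 rows and columns, so the indices `origin_index` and `announcement_index`, and the function names, are assumptions made only for illustration.

```python
SIZE = 32  # matrix dimension given in the text


def new_survey_matrix() -> list:
    """Create an all-zero 32 by 32 matrix of completion peg counts."""
    return [[0] * SIZE for _ in range(SIZE)]


def peg_completion(matrix: list, origin_index: int, announcement_index: int) -> None:
    """Increment the count for one completed call.

    origin_index is assumed to be derived from the incoming trunk subgroup,
    announcement_index from the announcement called; both must be 0..31.
    """
    if not (0 <= origin_index < SIZE and 0 <= announcement_index < SIZE):
        raise ValueError("index outside the 32 by 32 matrix")
    matrix[origin_index][announcement_index] += 1


def calls_from_origin(matrix: list, origin_index: int) -> int:
    """Total completions from one origin across all surveyed announcements."""
    return sum(matrix[origin_index])
```

A sponsor-oriented report could then compare row totals to see which originating areas respond most strongly.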


4.3.8 Counting of media stimulated calls and dial-up port 


The MSC counting service uses two existing capabilities in the No. 4
ESS, namely, CCIS direct signaling and dial-up port. The use of CCIS
direct signaling for MSC count transmittal is actually the first
application of CCIS direct signaling in the field. The MSC counting
service is comprised of two basic functions: slave application processing
and master application processing.


4.3.8.1 Slave processing for MSC counting. When start time arrives
for an MSC counting slave application and it is activated, all the
involved counters are cleared and an indicator is set so that Call
Processing pegs counts for each dialed number associated with the
slave application.

The counts for a slave application are sent to the master No. 4 ESS
machine every 5 min using CCIS direct signaling messages. The specific
times for sending these counts are determined by the slave service
start time together with the skewing factor. The skewing factor is a
number ranging from 0 to 300 s which represents the offset, in seconds,
from the start time when the first set of counts is to be sent. Counts
are sent every 5 min thereafter. This skewing factor is intended to even
the load on the network and on the master No. 4 ESS machine.
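
The resulting transmission schedule is simple arithmetic: the first set of counts goes out at start time plus the skewing factor, and each later set follows 300 s after the previous one. A minimal sketch, with the hypothetical function name `count_send_times` and times expressed as seconds:

```python
PERIOD_S = 300  # counts are sent every 5 min


def count_send_times(start_s: int, skew_s: int, n: int) -> list:
    """Return the first n send times for a slave application.

    start_s is the slave service start time and skew_s the skewing
    factor (0 to 300 s), both in seconds on the same clock.
    """
    if not 0 <= skew_s <= 300:
        raise ValueError("skewing factor must be between 0 and 300 s")
    return [start_s + skew_s + k * PERIOD_S for k in range(n)]
```

For a start time of 0 and a skewing factor of 45 s, the first three sets of counts would be sent at 45, 345, and 645 s. Two slave offices with different skewing factors therefore never send at the same instant.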

Each MSC counting CCIS direct signaling message consists of eight
signaling units. Included in this message is the master application
number to which the counts are being sent and two four-digit line
numbers together with their associated 5-min peg counts. The number
of CCIS direct signaling messages required to send all the counts for a
given slave application is equal to the number of line numbers
associated with the slave application divided by two.

The rate at which these CCIS direct signaling messages are sent is
metered so that a maximum of one message per second is sent for a
given application. The impact of this number of messages is minimal
when compared to the potential CCIS network capacity. This ensures
that MSC counting services will not by themselves force a CCIS terminal
into congestion.
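
The message load for one 5-min reporting cycle follows directly from these figures. The sketch below is illustrative only; the function names are hypothetical, and rounding up for an odd number of line numbers is an assumption, since the text states only "divided by two".

```python
import math

LINES_PER_MESSAGE = 2          # two four-digit line numbers per message
SIGNALING_UNITS_PER_MESSAGE = 8
MAX_MESSAGES_PER_SECOND = 1    # metering limit for a given application


def messages_needed(line_numbers: int) -> int:
    """Messages required to report all counts for one slave application."""
    return math.ceil(line_numbers / LINES_PER_MESSAGE)  # rounding up is assumed


def signaling_units_needed(line_numbers: int) -> int:
    """Total signaling units for one reporting cycle."""
    return messages_needed(line_numbers) * SIGNALING_UNITS_PER_MESSAGE


def minimum_send_time_s(line_numbers: int) -> float:
    """Shortest time to send one cycle at the metered rate."""
    return messages_needed(line_numbers) / MAX_MESSAGES_PER_SECOND
```

For example, a slave application with 20 line numbers would need 10 messages (80 signaling units), spread over at least 10 s of each 300-s cycle.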

In cases of CCIS blockage or network overload, returned MSC counting
CCIS direct signaling messages are accepted by the slave office. Receipt
of such a returned message triggers implementation of a control to
meter more stringently the rate at which messages are sent to that
destination. When a control is applied, only one message is sent to
that destination every 10 s. Such a control applies for 2 min and can
be extended if any of these more stringently metered messages are
returned. The counts from each of these returned messages are added
into the current peg count so that these counts are not lost.
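
The overload control just described can be sketched as a per-destination meter. This is a sketch under assumptions, not the actual slave-office logic: the class name `DestinationMeter` is hypothetical, times are plain seconds, and an extension is modeled as restarting the 2-min period from the time of the latest returned message, which the text does not specify.

```python
class DestinationMeter:
    """Hypothetical per-destination metering of MSC counting messages."""

    NORMAL_INTERVAL_S = 1       # at most one message per second
    CONTROLLED_INTERVAL_S = 10  # one message every 10 s under control
    CONTROL_DURATION_S = 120    # a control applies for 2 min

    def __init__(self) -> None:
        self.control_until = None   # time at which the control lapses
        self.peg_counts = {}        # current peg counts by line number

    def send_interval_s(self, now_s: float) -> int:
        """Minimum spacing between messages to this destination at now_s."""
        if self.control_until is not None and now_s < self.control_until:
            return self.CONTROLLED_INTERVAL_S
        return self.NORMAL_INTERVAL_S

    def on_returned_message(self, now_s: float, counts: dict) -> None:
        """Apply (or extend) the control and fold returned counts back in."""
        self.control_until = now_s + self.CONTROL_DURATION_S  # assumed extension rule
        for line_number, count in counts.items():
            self.peg_counts[line_number] = self.peg_counts.get(line_number, 0) + count
```

Adding the returned counts back into the current peg counts means they are simply carried in a later message rather than lost.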

When stop time arrives, counts continue to be sent for five more
minutes to ensure that the master No. 4 ESS receives all the counts. If
counts could not be sent during this 5-min period after stop time, a
report is sent to ONAC listing all the unsent counts.


4.3.8.2 Master processing for MSC counting. Fifteen min before the
start time, the master application is initialized. All the associated
counters are cleared. If a dial-up connection is not required, a report
is output to ONAC stating that the application has started. If a dial-up
connection is required for this application and a connection to the
same destination is already up, a report is output to ONAC that the
application has started and the counts start to appear on the
sponsor's output terminal. If a dial-up connection is required and one
is not already up, dial-up port software is used to dial up the ONAC
count distributor automatically 15 min prior to the master application
start time. Every minute after start time the last four digits of each
directory number together with the cumulative count for that directory
number are output from the No. 4 ESS. The ONAC count distributor
transmits this count information to the sponsor's location(s). This
count distributor provides local monitoring for problems and
information content and increases reliability.
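
The content of each one-minute output can be sketched as follows. The actual output format is not given in the text, so the function name `minute_report`, the use of (digits, count) pairs, and the sort order are all assumptions for illustration.

```python
def minute_report(cumulative_counts: dict) -> list:
    """Build one per-minute report from cumulative counts.

    cumulative_counts maps a full directory number (as a string) to its
    cumulative count; only the last four digits are reported.
    """
    return [(directory_number[-4:], count)
            for directory_number, count in sorted(cumulative_counts.items())]
```

Because the counts are cumulative, a report lost on the dial-up link is superseded by the next one rather than leaving a gap in the totals.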

If a dial-up connection goes down, a report is sent to ONAC and
another report to the No. 4 ESS office personnel. Three automatic
attempts are made to reestablish the connection, each of these
attempts being 2 min apart. Another report is issued as to whether or
not the connection could be reestablished. The counts, however, are
not cleared in the reestablishment process.

Counts continue to be sent to the sponsor for approximately 8 min
after stop time for the master application. Then a final message is
printed to the sponsor stating that the service has stopped.

A number of MSC counting manual capabilities exist as follows:

(i) A master application can be initialized manually.

(ii) An option exists for manually zeroing counts.

(iii) A master application can be stopped if problems arise or if a
graceful shutdown is desired for an open-ended application. (Once a
master application is manually stopped, it will not automatically start
again. It must be manually initialized.)


4.3.9 Audits and system integrity 


To ensure the integrity of the various data bases involved with MAS,
several Audits have been developed. These Audits periodically look
for inconsistencies between the contents of MAS

(i) administration software structures in call store,

(ii) administration software structures backed up in file store,

(iii) translations, and

(iv) firmware data structures.

If an error is found, the corrective action involves reinitializing all
data structures pertaining to the involved announcement(s), including
those in the MAS hardware. Thus, all audio for the involved
announcement(s) is lost.

Special treatment was necessary for System Integrity and especially
for Announcement Handling and Peripheral Maintenance to ensure
that MAS announcement audio is not lost during office phases up to
and including a phase 4. All call store data structures related to MAS
announcements are cleared in phases 2, 3, and 4. Certain call store
MAS announcement structures are then rebuilt by retrieving backup
data from file store. Audio for announcements in transient states is
lost. [All MAS audio can be removed during phase 4 by manually
depressing a Direct Data Insert (DDI) key on the Master Control
Center (MCC) panel and subsequently running a phase with the Modify
Recovery Action key set.]

All the new MAS-related calls, described in the call processing section,
are taken down in phases 2, 3, and 4. These calls are considered
transient because they cannot be rebuilt. (These calls are still
connected to a call register, in the case of update calls, or they are
connected to a dedicated TSI. In both of these cases, the connection
information resides in call store which has been cleared.)


V. SUMMARY 


An MAS announcement capability has been developed for No. 4 ESS
to take advantage of its position in the DDD network. A large complex
software package coupled with two new hardware frames introduced
in the 4E5 generic of the No. 4 ESS provides capabilities so that
sponsors can offer PAS announcements, counting of media stimulated
calls, and cut-through services on a local, regional, or national basis.
This development provides high capacity, flexible services that also
include network protection aspects. Many of the MAS announcement
capabilities are unique capabilities for toll switching offices. The MAS
software package, which is comprised of approximately 150,000 words,
is organized into functional areas and involves almost all functional
areas in the No. 4 ESS.


This paper describes a new hardware subsystem developed to
provide mass announcement capabilities for No. 4 ESS. The subsystem
records, stores, and plays back recorded announcements for
distribution through the No. 4 ESS switching network. Announcements are
stored in digital form on a moving-head disk system. Microprocessors
are used for control of disks and of interfaces. Duplicated hardware
ensures high reliability, and extensive self-testing capability is
provided.


I. INTRODUCTION


Provision of the mass announcement capability on the No. 4
Electronic Switching System (ESS) requires the system to record and store
voice announcements, and to play them back to large numbers of
calling customers. The capability of instantaneously creating multiple
copies of an announcement and distributing the copies to many callers
is inherent in the design of the No. 4 ESS digital switching network,’
and is a principal reason for choosing No. 4 ESS as the vehicle for the
mass announcement service. Recorded announcement hardware
already in No. 4 ESS lacked the capacity and features needed, so a new
hardware subsystem was designed.

This paper describes the hardware subsystem portion of the No. 4
ESS mass announcement capability. An overview of the entire
capability, including No. 4 ESS processor software and interactions with the
telephone network, appears in a companion paper.”

Architecture of the new mass announcement subsystem is influenced
by the existing No. 4 ESS architecture and interfaces, and the functional
requirements and reliability objectives of the new service. For example,
control information between the No. 4 ESS processor and the mass
announcement subsystem is carried via the Peripheral Unit Bus (PUB)
system which is used as the interface to the switching network and
transmission interface equipment. This bus interface requires
equipment similar to that used in other No. 4 ESS peripheral hardware
frames, such as the Digital Interface (DIF) frame.*® The simplest
interface for voice signals to the all-digital No. 4 ESS switching network is
the serial digital pulse code modulation (PCM) encoded format used for
internal transmission within the No. 4 ESS network. This format was
chosen for the recording and playback interface, and for storing the
recorded announcements.

Figure 1 shows the major components of the mass announcement
subsystem. Access to the No. 4 ESS processor is via the Peripheral Unit
Control (PUC) equipment frame, which provides a standard bus
interface plus control circuitry and programs to allow one PUC to serve one
or two Mass Announcement System (MAS) frames and, potentially,
also to serve additional frames containing features yet to be designed.


[Figure 1 labels: to No. 4 ESS processor; PUB 0, PUB 1; bus access 0,
bus access 1; controller 0, controller 1; PUC frame; EIB 0; control path;
MAS frame; PCM voice path; No. 4 ESS switching network.]

Fig. 1—Mass announcement subsystem block diagram.

The PUC's full duplex bus interface and duplex controllers allow full
service to continue even if an internal PUC unit fails.

Recording and distribution of announcements is performed by the
MAS frame (Fig. 1). This frame contains two identical Mass
Announcement Units (MAUs). Each has a disk controller and is associated with
an 80-Mbyte moving-head disk system for announcement storage. In
contrast to the PUC, whose duplex controllers perform identical tasks
in synchronism, the two MAS disk controllers operate independently.
All announcements are stored on both disks, however, so that if one
unit is lost, all announcements remain available. A temporary service
requiring longer customer waiting time for an announcement to start
is provided in such cases.

In both the PUC and MAS frames, microprocessor systems are used
extensively. Both the MAS disk controller, which controls data flow to
and from the disks, and the PUC executive controller, which governs
PUC internal data movements and the interface to MAS, are high-speed
bipolar bit-sliced microprocessors. The PUC also contains a slower
BELLMAC®-8 microprocessor for background maintenance tasks and
initialization.


II. SYSTEM INTERFACES


A number of external and internal interfaces exist in the mass
announcement subsystem (Fig. 1). This results from the structure of
No. 4 ESS and from the characteristics of the PUC and MAS frames.

Voice signals for recording and playback of announcements are sent
via a coaxial DS-120 high-speed serial PCM data link between each MAS
unit and the switching and permuting circuit (SPC) that serves the MAS
unit in a Time-Slot Interchange (TSI) frame in the No. 4 ESS network.
Each link provides 120 two-way voice and eight maintenance channels.
Sixty channels are used for playback, including certain channels
reserved as "monitor channels" for verifying announcement integrity
after recording. Fourteen channels are used for recording, and certain
other channels are used for maintenance purposes. The interface
carries no control information other than for timing and synchronizing
the link itself.

Control information for the subsystem is carried via the PUB from
the No. 4 ESS processor to the PUC frame. This is a 96-bit (total both
directions) parallel interface under control of the processor. Frames
are addressed via coded enabling bit fields on the bus. A wide variety
of operational and maintenance orders destined for the PUC frame and,
via the PUC, for the MAS frame are sent over this interface.

Additional external interfaces to the subsystem include dc loop and
ac pulse leads from No. 4 ESS signal processor frames. These links are
used to monitor the status of both PUC and MAS frames, and to provide
certain subsystem configuration functions. Finally, the PUC receives
timing information from TSI frames.

Within the subsystem, the major interface involves the PUC and MAS
frames. A bus system used internally in PUC is extended to serve one
or two complete MAS frames, and is designed to accommodate
additional units in the future. Each PUC controller provides two extended
internal buses (EIBs); all MAS units connect to one of the buses from
each PUC controller. Each bus contains a 24-bit, parallel two-way data
field. Data flow on each bus is under control of its associated PUC
controller. Source and destination fields govern the transfer of
information in either direction between internal PUC and MAS registers.
Additional leads are provided for handshaking and error-control
purposes.


III. EQUIPMENT DESIGN


Throughout the mass announcement subsystem, the recently
introduced BELLPAC* packaging system technology is used.* The major
subsystem components are the PUC frame, the MAS frame, and the disk
systems. Figure 2 shows a photograph of the complete subsystem.
Both frames use BELLPAC packaging system technology, and are 39
inches wide and seven feet high; the PUC frame is 12 inches deep (usual
for No. 4 ESS), while the MAS frame depth is 18 inches. The MAS frame
depth reflects use of circuit packs also used in the 3B Processor system.

The PUC frame houses one duplicated peripheral controller unit, a
duplicated PUB interface, and associated power equipment. A vertical
cabling trough divides controllers 0 and 1; the controllers are generally
mirror images. The two PUB interfaces are located above the controller
units, and principally contain the cable drivers and receivers required
to interface to the No. 4 ESS processor. Power supply equipment for
each controller unit is located in the lower part of the PUC frame; +140
V input power is converted to +5 and -5.2 V for use within the frame,
and +24 V input power is used directly. Power for the PUB interfaces
is separate and is derived from converters located in the PUB units.
Electronic sequencing, regulation, and overload control are provided on
circuit packs located both in controller and PUB units.

The MAS complex contains a single-bay frame and two 80-Mbyte
moving-head disk drives (Fig. 2). The two disk drives are located one on
each side of, and adjacent to, the MAS frame. The frame is equipped with
two identical MAUs and associated power equipment. Each MAU is
associated with one disk and consists of a controller and circuits
interfacing to the PUC, the disks, and the No. 4 ESS switching network.


* BELLPAC is a trademark of Western Electric.

The lower half of the MAS frame is used to house the power
equipment. To provide autonomous operation of the two MAUs and
the associated disk drives, duplicate power feeders of +140 V, +24 V,
-48 V dc and 208 V single-phase ac are cabled to the frame from their
respective No. 4 ESS office power plants and from the office power
service circuit. The +140 V is converted to +5 V, -5 V, and +12 V dc
for use by the MAUs. The -48 V is inverted to 208-V ac for use by the
disk drives whenever commercial ac or central office essential ac is
interrupted or falls outside the drive’s operating limits.

Fig. 2—Mass announcement subsystem frame complex.

The circuit packs in both the PUC and MAS frames use both Schottky
and low-power Schottky transistor-transistor logic, and high-speed
emitter-coupled logic contained in dual in-line packages (DIPs). Three
different sizes of circuit packs are used; the sizes vary from 4 by 9 in.
to 8 by 13 in. Connector pinouts available on these packs number 100
or 200 pins.

The circuit pack technologies used are the double-sided rigid board 
and the multilayer board with both external and internal power and 
ground planes. Both types are nominally 0.0625-in. thick. 

The double-sided rigid board is an epoxy-glass board with etched 
copper-printed wiring on both sides. Path widths range from 0.006 in. 
to 0.050 in., and plated-through holes of 0.020 in. are used. This board 
is primarily used for low-density circuitry. 

The multilayer boards come in four- and six-layer versions and
are used for high DIP packing densities. The four-layer boards use the
two external layers for distributing power and ground and for the
connector fanout patterns. The two internal layers are assigned to
signal routing.

In the six-layer versions, power and ground planes occupy the
innermost internal layers, which improves electrical characteristics.
Several voltage levels can be provided through segmentation. The two
external layers are generally used only for soldering pads and for
connector fanout patterns, but can also be used for signal routing, if
necessary. The remaining two internal layers are for signal routing.
Signal paths on the multilayer boards can be as wide as 0.025 in.
and can be narrowed to 0.008 in. where two paths pass between DIP
terminals or plated-through holes.

Fig. 3—BELLPAC™ packaging system technology TN circuit pack.

Fig. 4—Mass Announcement System floor plan.

A six-layer 8- by 13-in. circuit pack is shown in Fig. 3. Both power
supply filtering and a large number of decoupling capacitors are used 
on this pack. These capacitors are judiciously placed to minimize noise. 

A maximum of eight MAS complexes, each consisting of one MAS
frame and two disk drives, may be installed in a No. 4 ESS office.
However, to meet reliability objectives, a maximum of only two MAS
complexes may be connected to a PUC frame.

A typical central office floor plan for one mass announcement
subsystem is shown in Fig. 4. Because of the disk size, the MAS complex
consumes the space normally allotted to two frame lineups. The disks
also provide a convenient break in the frame lineup so a MAS frame
with a depth of 18 in. can be used; most No. 4 ESS frames are 12 in.
deep. The PUC and MAS frames are placed in the same lineup to keep
the interconnecting cables as short as possible.


IV. PERIPHERAL UNIT CONTROL CIRCUITS 


The PUC frame provides a control interface between the No. 4 ESS
processor (1A Processor Common Control) and new equipment that
must be controlled by the processor. The mass announcement subsys-
tem frame is the first user of the PUC, but the PUC has been designed
so that future services can be added easily, with only PUC micropro-
cessor firmware changes required. This eliminates the need to develop
a new processor interface for each new hardware system, and saves on
the cost and time required to add services to the No. 4 ESS.

Much of the PUC circuitry is similar to the controller of the DIF,
which is discussed separately.’ We briefly review the common portions.

Fig. 5—Peripheral Unit Control frame controller block diagram.


The main unique features of the Peripheral Unit Control circuits
include the extended internal bus, which connects to the mass an-
nouncement frame, and the microprocessor programs or “firmware”
that govern the controller’s operation.

Each PUC controller centers about an executive controller (Fig. 5),
a high-speed bipolar-technology microprocessor which provides proc-
essing functions and controls data transfer operations on an internal
bus. The executive controller has access to registers and buffer mem-
ory, and controls interfaces to the PUB and mass announcement frame.
A second processor, the maintenance processor, can be used for
background tasks.


4.1 No. 4 ESS processor interface 


The duplicated PUB provides a fully duplicated data and control path
between the No. 4 ESS processor and the PUC frame; Figure 1 shows
this interface. Each bus consists of four groups: the enable
address bus, the write bus, the reply bus, and the control bus. The
enable address and write buses (PUWB) convey instructions from the
processor to the PUC, while data from the PUC is sent to the processor
via the reply bus (PURB). Control and maintenance information is
transmitted to and from the PUC over the control bus. Each of the
duplicated PUC controllers has full two-way access to both buses; bus
access routing is controlled by the No. 4 ESS processor through flip-
flops in the PUC.

Circuits that provide access logic to the bus consist of a receiving 
register, a reply register, a sequencer, and checking circuits. Orders 
from the processor to the PUC are sent over the PUWB. The order is 
latched in a receiving register, and checked for validity and for the 
correct address code. For valid orders, the hardware generates a high- 
priority interrupt to the executive controller, which then processes the 
order. When processing is complete, the results appear in the reply 
register, usually within 20 µs. The bus access hardware then gates the
reply data onto the peripheral unit reply bus. The bus access logic 
removes significant real-time overhead from the executive controller. 


4.2 Executive controller 


The executive controller is a microprogrammed bit-sliced processor
with a basic cycle rate of 4 MHz. It accepts interrupts from three
external sources: from the No. 4 ESS processor, from the MAUs, and
from the maintenance processor located within each PUC controller.
An interrupt request points to a starting address in the firmware
microprogram, a set of routines that route data between different
registers and memory locations. The firmware microprogram is con-
tained in 4096 words of read-only memory (ROM); each word is 40 bits
wide, of which eight bits are used for parity checking. Processor
hardware consists of ROM, sequencer, interrupt control, and arithmetic
and logic circuits.

Addressing for the microprogram ROM is performed by sequencer
circuits, which provide for conditional program branching. External
interrupts are handled by a 16-level priority interrupt controller, which
passes a starting address to the processor when an interrupt is received.
An 8-bit-wide arithmetic and logic unit provides computational power.
Certain critical circuits (arithmetic unit, sequencer, and interrupt
control) are duplicated within each controller; matchers between the
duplicated circuits provide improved fault detection.


4.3 Registers and buffer memory 


Each controller in the PUC frame contains a set of internal special-
purpose operational and maintenance registers. Operational registers
include a status register, reflecting critical configuration and data
routing states; receiving and reply registers for incoming and outgoing
orders from the No. 4 ESS processor; and a cutoff register used to
isolate MAS units from the PUC. Maintenance registers include error-
source registers (ESRs), or error indicators, for hardware failures; an
exercise register that creates abnormal conditions to verify the opera-
tion of error-detection circuits; and a pest register for disabling indi-
vidual error indications.

The PUC controller has access to a 24-bit, 256-word random-access
memory (RAM). This RAM is logically divided into operational and 
maintenance buffers, and is principally used as intermediate storage 
for MAS operational and maintenance reports to be forwarded to the 
No. 4 ESS processor. 


4.4 Internal bus 


Data transfer within each PUC controller is via a 24-bit data bus; in
addition, 6-bit code fields are provided to select the source and desti- 
nation of each data transfer. Each possible source register is assigned 
to one port of a 16-port multiplexer; the source code selects the 
appropriate register and port. The output of the multiplexer is routed 
to all possible destination registers; the proper register receives the 
data in response to the appropriate destination code. 

The internal PUC data bus is extended outside the frame to serve
MAS and other circuits that may be provided in the future. One 
bidirectional data bus serves all MAS units connected to each PUC 
controller. Tristate, dc-coupled cable drivers are provided at each unit; 
cutoff leads are provided to disable MAS units suspected to be faulty. 
Additional control leads are provided for handshaking and synchroni- 
zation of data transfers between PUC and MAS. 


4.5 Maintenance processor 


A maintenance processor is used for localized diagnostics and fault
recovery within the controller. This is a BELLMAC-8 single-chip, bus-
structured, general-purpose microprocessor, with 60K words of ROM
program and 4K of RAM. The maintenance processor is capable of
interrupting the executive controller and simulating No. 4 ESS pro-
cessor orders. Fast, direct access by the No. 4 ESS processor to the
maintenance processor RAM is provided. Maintenance processor pro-
grams include an operating system, bootstrap routines, maintenance
processor diagnostics, and application programs used to initialize the
PUC frame.


4.6 Peripheral Unit Control programs 


The PUC processor complex is programmed to handle operational
and maintenance instructions from the central control and from units on
the extended bus. The executive controller programs that handle these
jobs are organized as a hierarchy of tasks entered from a control
program that services interrupts and controls job scheduling. Since the
controller is an interrupt-driven processor, the program services inter-
rupts from the No. 4 ESS processor and MAS interfaces, plus mainte-
nance interrupts, such as errors and real-time clock interrupts, which
initiate exercise and audit routines.


4.7 No. 4 ESS processor order-handling programs 


Processing No. 4 ESS processor orders that have been detected and
validated by bus access circuitry is a principal function of the executive
controller. The hardware first generates an interrupt request; if an
autonomous or background task is running, the task is interrupted at
an appropriate point and the incoming order is processed. The opcode
part of the order is used as an index to branch to the program that
handles it. The
remaining parts of the order are used as data or address information
to access specific registers or memory, or initiate multiple-operation
macro functions. For a simple read register order, the task routine
moves data from a register selected by the program into the peripheral
bus reply register. Similarly, for a write order, data moves from the
peripheral bus receive register to a destination register. When the
write order is completed, the reply register is loaded with an order to
return an all-seems-well (ASW) acknowledgment to the No. 4 ESS
processor. Controller hardware takes over once the reply register is
accessed; a hardware sequencer is used to return ASW and data to the
No. 4 ESS processor in conformance with bus timing requirements.
Execution then returns to the program that had been interrupted, or
to an idle routine if no jobs were active when the No. 4 ESS processor
order interrupt occurred.
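
The order-handling flow described above can be pictured as a jump table indexed by opcode. The following Python sketch is illustrative only: the order layout (an 8-bit opcode above a 16-bit operand in a 24-bit word), the opcode values, and the names `handle_order`, `OP_READ`, and `OP_WRITE` are assumptions made for the example and are not taken from the PUC firmware.

```python
# Hypothetical model of opcode-indexed order dispatch in the PUC.
# Field widths and opcode values are assumptions for illustration.
ASW = "ASW"  # all-seems-well acknowledgment

OP_READ = 0x01   # assumed opcode: read a register
OP_WRITE = 0x02  # assumed opcode: write a register

registers = {0: 0, 1: 0}  # toy register file, keyed by register number


def do_read(operand, receive_data):
    # Move the selected register into the reply register.
    return (ASW, registers[operand])


def do_write(operand, receive_data):
    # Move data from the receive register to the destination register.
    registers[operand] = receive_data
    return (ASW, None)


JUMP_TABLE = {OP_READ: do_read, OP_WRITE: do_write}


def handle_order(order, receive_data=0):
    """Split a 24-bit order into opcode and operand, then dispatch."""
    opcode = (order >> 16) & 0xFF
    operand = order & 0xFFFF
    task = JUMP_TABLE.get(opcode)
    if task is None:
        return ("NO_ASW", None)  # unknown opcode: no acknowledgment
    return task(operand, receive_data)
```

In this toy model the return value stands in for the contents of the reply register; in the real controller a hardware sequencer returns ASW and data on the bus.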


4.8 Mass Announcement System order-processing programs 


A unit on the PUC extended bus, such as MAS, initiates communica-
tion with the No. 4 ESS processor by loading reports in buffers in the
PUC executive controller RAM. Processor orders periodically unload
these reports. The reports may be responses to macro tasks previously
initiated by the processor, or the unit may initiate reports autono-
mously due to operational or maintenance conditions. To load a report
in the PUC, the unit loads the report type and data in its reply register
and signals on a common party-line interrupt request lead. The PUC
interrupt-handler routine polls the units on the bus to determine which
units require service, and then reads the report data. A task-dispenser
routine services each unit that responded by examining the report-
type field and branching to a task designed to handle that data. When
a task has been handled successfully, the controller program resets the
unit’s reply register and interrupt request. If a report cannot be
handled because a buffer is full, the interrupt request for the unit
remains set, and a retry is attempted later.
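
A minimal sketch of this report-loading protocol follows, written in Python for readability. The class `Unit`, the function `service_interrupt`, the report-type strings, and the buffer capacity are all hypothetical; the paper does not give buffer sizes or report encodings.

```python
from collections import deque

BUFFER_CAPACITY = 4  # assumed size; the paper gives no figure here


class Unit:
    """A unit on the extended bus with a reply register and request flag."""

    def __init__(self, name):
        self.name = name
        self.reply = None        # (report_type, data) or None
        self.interrupt = False   # this unit's request on the party-line lead

    def post(self, report_type, data):
        # Load the reply register and raise the interrupt request.
        self.reply = (report_type, data)
        self.interrupt = True


def service_interrupt(units, buffers):
    """Poll every unit; buffer each pending report or leave it for retry."""
    for unit in units:
        if not unit.interrupt:
            continue
        report_type, data = unit.reply
        buf = buffers[report_type]        # task chosen by report type
        if len(buf) >= BUFFER_CAPACITY:
            continue                      # buffer full: request stays set
        buf.append((unit.name, data))
        unit.reply, unit.interrupt = None, False  # reset after success
```

Calling `service_interrupt` again after the processor has unloaded a buffer picks up any report whose request was left set, which models the later retry.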

In the reverse direction, the PUC initiates communication with a
peripheral unit such as MAS by checking the receiving register on the
communication register pack. If the upper byte of the register is
nonzero, an earlier order has not yet been acknowledged by MAS, which
services the receiving register every 250 µs. In this case, the order will
be reattempted later. If the upper byte is zero, then the PUC sends the
order to the receiving register and places a nonzero value in the upper
byte of the register. When the MAS controller next checks the upper
byte, it unloads the receiving register, and zeros the upper byte.
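
The upper-byte handshake just described behaves like a one-slot mailbox. The sketch below models it in Python under stated assumptions: a 24-bit register whose lower 16 bits carry the order, and an arbitrary nonzero tag of 0x01 in the upper byte. The class and method names are invented for the example.

```python
class ReceivingRegister:
    """24-bit mailbox: a nonzero upper byte means an order is pending."""

    def __init__(self):
        self.value = 0

    def puc_send(self, order16, tag=0x01):
        """PUC side: write only if the previous order has been taken."""
        if (self.value >> 16) & 0xFF:
            return False             # still pending; reattempt later
        self.value = ((tag & 0xFF) << 16) | (order16 & 0xFFFF)
        return True

    def mas_poll(self):
        """MAS side (every 250 microseconds): unload and clear the flag."""
        if not (self.value >> 16) & 0xFF:
            return None              # nothing pending
        order = self.value & 0xFFFF
        self.value &= 0x00FFFF       # zero the upper byte
        return order
```

Because only the PUC sets the upper byte and only MAS clears it, neither side needs a separate lock; the flag itself serializes access to the register.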

Two buffers are assigned in PUC for communication from MAS to the
No. 4 ESS processor. Operational reports are loaded in a low-priority
buffer dedicated to handling single-word reports. (A high-priority
buffer, used in certain other No. 4 ESS frames, is not used in PUC/MAS.)
A maintenance buffer sends multiword diagnostic raw data and echo
reports, which acknowledge all orders sent to MAS. The PUC performs
protocol checks on reports received from MAS, and reports irregularities
to the No. 4 ESS processor.


4.9 Exercise and sanity programs 


To aid in the rapid detection of faults, an exercise program is
included in the PUC. This program is entered every 10 ms by an
interrupt request generated by the controller clock. The program tests
all of the controller logical and arithmetic operations. The ability to
access most of the registers is tested by read instructions. The test
does not destroy register data and is segmented so it can be interrupted
by processor orders within 3 µs. Hardware error detectors are used to
verify proper operation. If a failure occurs, ESR bits are set that alert
the processor by a peripheral bus maintenance interrupt. The 10-ms
interrupt is also used to make a maintenance buffer sanity check;
should a multiword report being loaded in the buffer not be completed
in a reasonable time, the report is closed so that new reports may be
loaded.

4.10 Peripheral Unit Control frame summary 


In summary, the PUC contains hardware and microprocessor soft-
ware to provide an interface between the mass announcement frame
and the No. 4 ESS processor. The Peripheral Unit Control frame is
designed with flexible interconnections so that equipment to provide
new features may be easily added to the No. 4 ESS hardware commu-
nity in the future.


V. MASS ANNOUNCEMENT SYSTEM FRAME 


The MAS frame contains two mass announcement units, each asso-
ciated with a disk storage system. The units connect to the extended
internal bus of the PUC for control purposes, and via 120-channel DS-
120 links to the switching and permuting circuits of a TSI frame in the
No. 4 ESS switching network for recording and playback of recorded
announcement signals. Figure 6 contains a block diagram of the major
components of a mass announcement unit.

Fig. 6—Mass Announcement System unit block diagram.

New announcements are placed on the disk by allocating appropriate 
sectors and designating them as “standby,” i.e., not currently playing 
back. The announcement is then recorded on one disk over a DS-120
link channel via a recording buffer. Coordinating messages are sent
from the No. 4 ESS processor to both disks, and the announcement is
transferred via an update buffer and a dedicated bus to produce a 
duplicate copy on the disk of the second announcement unit. Later, 
under processor control, both MAS units are instructed to change the 
announcement status from “standby” to “active,” and actual playback 
begins. 


5.1 Playback system 


Each MAS unit provides announcement playback of 30-s message
segments which are read from the disk. Any segment may be assigned
to any of the DS-120 link channels dedicated to playback, and segments
may be concatenated. Since part of a segment may be silent, the
system has the ability to play messages varying from a few seconds to
5 min in length. Each unit can provide up to 29.5 min total storage of
announcements ready for playback.

Although the two units generally store the same announcements, 
the units are duplicates only in a limited sense. Each MAS unit is 
essentially a simplex unit playing back announcements independently 
of the other unit but skewed in time so that when one unit begins the 
playback of a new cycle of 30-s announcement segments, its mate unit 
is at the midpoint of its 30-s segment cycle. The No. 4 ESS system
connects callers to the announcement unit which first reaches the 
beginning of the required announcement; the skewing reduces the 
average waiting time to 7.5 s (Fig. 7). 
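
The 7.5-s figure follows from simple averaging, which the short Python calculation below reproduces. It assumes that callers arrive uniformly over the 30-s cycle and wait for the next segment start on whichever unit comes first; the function name `mean_wait` and the sampling approach are choices made for this illustration.

```python
def mean_wait(cycle_s, start_offsets_s, steps=3000):
    """Average wait until the next segment start on any unit.

    Callers are assumed to arrive uniformly over one cycle. The wait
    is sampled at the midpoint of each of `steps` equal intervals.
    """
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * cycle_s / steps
        # Time from arrival t to the next start of each unit, wrapped
        # around the cycle; the caller is served by the earliest one.
        total += min((off - t) % cycle_s for off in start_offsets_s)
    return total / steps


single = mean_wait(30.0, [0.0])        # one unit alone
skewed = mean_wait(30.0, [0.0, 15.0])  # two units, half-cycle skew
```

With one unit the average wait is half the cycle, 15 s; with two units skewed by half a cycle it is halved again to 7.5 s, in agreement with the text.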

Each 30-s message segment is stored on the disk units in the 64-
kbit/s serial PCM data format used on digital transmission facilities and
within the No. 4 ESS network. The storage medium used is an 80-
Mbyte moving-head disk system. Each message segment is allocated
a three-dimensional portion of the disk storage called a “sector,” which
comprises all the accessible data within an angular portion of the disk.
A small section of every active announcement segment is accessed by
the disk heads upon each disk revolution.

Fig. 7—Message Phase construction. Construction of the six phases of a 90-second
message stored on disk sectors A, B, and C.

In operation, the MAS unit reads short pieces of each announcement
from the disk into individual playback buffers; all active announce-
ments are served sequentially. The buffers are emptied at a slower rate
into the appropriate time slots of the DS-120 link, one 8-bit PCM sample
being read out per announcement each 125 µs.
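
The rates quoted above fix the amount of data per announcement segment. The following Python arithmetic is a back-of-the-envelope check using only figures from the text (125-µs sample period, 8-bit samples, 30-s segments, 29.5 min of playback storage); it counts speech payload only and ignores error-correction and formatting overhead, which the paper does not quantify. The variable names are illustrative.

```python
SAMPLE_PERIOD_US = 125      # one 8-bit PCM sample every 125 microseconds
BITS_PER_SAMPLE = 8
SEGMENT_S = 30              # length of one message segment
TOTAL_STORAGE_MIN = 29.5    # playback storage per unit

samples_per_s = 1_000_000 // SAMPLE_PERIOD_US        # 8000 samples/s
bit_rate = samples_per_s * BITS_PER_SAMPLE           # 64000 bit/s
bytes_per_segment = samples_per_s * SEGMENT_S        # 240000 bytes
segments = int(TOTAL_STORAGE_MIN * 60 / SEGMENT_S)   # 59 segments
playback_bytes = segments * bytes_per_segment        # 14160000 bytes
```

The sample rate recovers the 64-kbit/s channel rate, and the 29.5 min of playback storage corresponds to 59 segments, or roughly 14 Mbytes of speech payload per unit.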

Data read off the disk are converted from serial to parallel form and 
error correction is performed based on cyclic codes. Parity is generated, 
and, after an intermediate buffering stage needed because of speed 
differences, the data are transferred to the playback buffer for delivery 
to the DS-120 link.

Each of the 64 playback buffers is permanently assigned to one of
the 64 even-numbered DS-120 channels 0 to 126. Of these channels and
buffers, four are dedicated to maintenance activity and up to 14 others
may be optionally designated as monitor channels. Of the remaining
64 channels, 14 are permanently assigned as recording channels, two
are maintenance recording channels, two are loop-back channels to
the TSI, and the rest are not used.


5.2 Recording and updating system 


Announcements may be recorded, activated, and deactivated via 
dedicated transmission facilities in the telephone network by a cen- 
tralized administration center (Fig. 8). This function is important in 
the offering of coordinated, nationwide mass announcement services. 
The administration center accepts and stores announcement messages 
from sponsoring telephone companies and commercial advertising 
sponsors, and handles the distribution of these announcements to MAS 
frames in No. 4 ESS offices.

Since the recording of announcements on the MAS disk may involve
long distances, transmission checks are done at the MAS end. The voice
signal is amplitude-compressed at the transmitting end and a pilot
tone at 2150 Hz is added so that levels can be monitored. Within the
telephone network the signal is converted to PCM digital format, with
additional compression according to the μ = 255 law quantization
companding standard.
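
For readers unfamiliar with μ = 255 companding, the sketch below shows the continuous μ-law characteristic and its inverse in Python. This is the textbook formula, offered as an illustration; telephone PCM equipment uses a segmented 8-bit approximation of this curve, and nothing here is taken from the MAS signal-processing design. The function names are invented for the example.

```python
import math

MU = 255.0


def mu_compress(x):
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y):
    """Inverse of mu_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Small amplitudes are stretched and large ones compressed, so that a fixed number of quantizing steps gives roughly constant signal-to-noise ratio over a wide range of talker levels. Converting back to linear form, as the MAS does before filtering, amounts to applying the inverse function.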

Fig. 8—Direct Distance Dialing network interface.

At the MAS end, the PCM data arrive on one of the assigned recording
channels of the incoming DS-120 link and are routed through digital
signal processing circuitry which first converts both the voice data and
the pilot tone back to linear PCM. A noise check on the transmission
link is performed before and after the announcement data are received.
Digital filtering is used to separate the tone from the voice data, and
to adjust the signal level based on the tone level. The signal is also
expanded to remove the compression which was inserted at the ad-
ministration center. If the tone level is not within specified limits, the
network connection is rejected and a retry is requested. The PCM data
are finally reconverted to μ = 255 format, changed to parallel form,
and delivered to a recording buffer. The data are then written into the
assigned disk locations.
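
The paper does not say which digital filter measures the pilot tone. As one plausible illustration, the Python sketch below uses the Goertzel recurrence, a standard single-frequency detector, to estimate the 2150-Hz pilot amplitude in a block of linear samples and compare it with limits. The choice of Goertzel, the block length, and the limit values are all assumptions for the example and do not describe the MAS circuitry.

```python
import math

FS_HZ = 8000      # PCM sampling rate implied by 125-microsecond samples
PILOT_HZ = 2150   # pilot tone frequency from the text


def tone_amplitude(samples, freq_hz=PILOT_HZ, fs_hz=FS_HZ):
    """Estimate the amplitude of one frequency with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return 2.0 * math.sqrt(max(power, 0.0)) / len(samples)


def level_ok(samples, low=0.08, high=0.12):
    """Accept the connection only if the pilot is within assumed limits."""
    return low <= tone_amplitude(samples) <= high
```

A block of 160 samples (20 ms) contains a whole number of pilot cycles at 8000 samples/s, so the estimate is not disturbed by speech energy at other frequencies that also fit the block exactly; in practice some leakage would remain and longer averaging would be used.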

Once an announcement has been recorded onto the disk of one MAS 
unit, a transfer or update of the announcement is scheduled for the 
other MAS unit. There are dedicated bus paths for updating in either 
direction between units, with update buffering capable of storing about 
16 s of digitized voice at each unit. At the appropriate time, data are 
transferred in blocks from the originating MAS unit disk through the 
update buffer to the destination MAS unit disk. This transfer allows 
the system to provide duplicated announcements with skewed phasing. 

Provision has also been made in MAS for local recording of announce-
ments directly, without the assistance of an administrative center.
This feature is invoked by placing a telephone call to a special tele-
phone number which results in a connection to an appropriate No. 4
ESS equipped with MAS. A tone is played back to the producer from
the recording buffer; this indicates that the desired recorded message
should begin. Digital signal processing does not occur in this mode;
message quality is confirmed when No. 4 ESS calls back the announce-
ment producer and plays the recorded announcement. The producer
decides whether sound quality is acceptable and either approves the
recording or repeats the entire process.


5.3 Controller hardware 


Central to the operation of MAS is a microprogrammed controller
using bit-sliced architecture. This Peripheral Interface Controller (PIC)
performs the basic function of data transfer between the moving-head
disk and the playback and recording buffers. It also controls execution
of orders from the No. 4 ESS processor via the PUC and generates
replies. Typical actions include reporting on system and announcement
status, updating announcements from one unit to the other, and
performing operational and on-request diagnostics.

The PIC is a 16-bit microprocessor-based controller designed using
Advanced Micro Devices, Inc., 2900 series bit-sliced integrated circuits. 
It is capable of a memory-to-memory data move operation in 183 ns. 
The processor also has 4096 words of 18-bit data RAM, eight priority- 
encoded interrupts, a sanity timer, a scratch register, and 17 general 
purpose registers in the arithmetic and logic unit. 

The program for the processor is stored in programmable read-only
memory (PROM) on three circuit packs. Each instruction is 40 bits
wide, including four parity bits. A total of 4096 instructions can be
stored on a single program store circuit pack.

The PIC executes the 40-bit program instructions resident in its 12K of
PROM in a pipelined fashion for purposes of speed and efficiency; while
the present instruction is being executed the next is being fetched by
the PIC’s sequencer. Each instruction’s speed of execution may be set
to 183 ns or 366 ns by the program itself. Data are transferred between
ports on an internal bus by specifying source and destination fields
within program instructions.

The No. 4 ESS processor communicates with the MAS unit via the 
PUC’s extended internal bus. This duplicated 24-data-bit bus is routed
sequentially from one MAS to the next, and enables either PUC con- 
troller to communicate with either MAS unit. 

Orders are sent from the No. 4 ESS processor via the PUC over the
extended bus to the MAS register, and reports of MAS activity leave the 
unit from the reply register. Each MAU interfaces with both buses of 
the duplicated EIB. The interface consists of two bus driver and 
receiver circuit packs and a communications register pack. Each bus 
driver-receiver pack includes in its circuitry an interrupt identification 
code generator and a bus source and destination decoder. An MAS 
frame and unit identification code is wired into these two circuits 
during office installation. 

The PUC may monitor the general health of a MAS unit by reading
the MAS ESR, part of the communications register. The lowest three
bits (EIB parity error, invalid controller activity, and communications
register interwrite error) indicate communications failure between MAS
and PUC, and result in a peripheral unit failure (F-level) interrupt in
the No. 4 ESS processor. The remaining 21 ESR bits indicate problems
of less severe nature and, when set, they generate a request to the PUC
for service. Three of these bits are directly wired in from the appro-
priate circuit packs to indicate PIC program memory parity failures,
program sanity time-outs, and clock errors. The remaining bits are set
by program tests, and include errors such as playback buffer errors,
disk control errors, and DS-120 framing and timing errors. The upper-
most five ESR bits provide the No. 4 ESS processor with information on
ASW failure errors which occur when MAS cannot successfully complete
an order from the No. 4 ESS processor. The first bit indicates ASW
failure; the remaining four bits form a code that indicates the specific
problem that the PIC had in handling the order.
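
The grouping of ESR bits can be summarized in a small decoder. In the Python sketch below, the 24-bit width and the split into three low-order communication bits and 21 other bits come from the text; the exact bit positions chosen for the ASW-failure flag (bit 23) and its four-bit cause code (bits 19 to 22) are assumptions, as are the function and key names.

```python
ESR_WIDTH = 24
COMM_FAILURE_MASK = 0b111   # lowest three bits (from the text)
ASW_FLAG_BIT = 23           # assumed position of the ASW-failure bit
ASW_CODE_SHIFT = 19         # assumed position of the 4-bit cause code


def decode_esr(esr):
    """Classify a 24-bit MAS error-source register value."""
    esr &= (1 << ESR_WIDTH) - 1
    asw_failed = bool((esr >> ASW_FLAG_BIT) & 1)
    return {
        "f_level_interrupt": bool(esr & COMM_FAILURE_MASK),
        "service_request": bool(esr >> 3),  # any of the other 21 bits
        "asw_failure": asw_failed,
        "asw_code": (esr >> ASW_CODE_SHIFT) & 0xF if asw_failed else None,
    }
```

For example, under these assumed positions `decode_esr((1 << 23) | (0b1010 << 19))` reports an ASW failure with cause code 10 and a service request, but no F-level interrupt.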

The PUC may place the MAS unit in a particular state by writing the
unit’s status register, which is bit-writable by the PUC and readable by
both the PUC and the PIC. The PUC can place the MAS unit in a
maintenance mode or a simplex mode (mate unit out-of-service) by
setting appropriate status register bits. An initialization bit forces the
PIC program address to zero and halts execution. Other status bits can
be set to mask ESR summaries to the PIC and interrupt requests to the
PUC.


5.4 Disk control programs and disk organization 


The PIC program’s principal function is to act as a disk controller;
the majority of the processing power is spent transferring data between 
the disk and the buffers. Other functions of the program have been 
designed to fit into the structure determined by the disk accessing 
tasks. 

Efficient playback of announcements from the disks is facilitated by 
a regular organization of the announcement storage locations on the 
disk. Since all announcements begin in synchronism, the exact time in 
each cycle when a given section of an announcement must be read is 
predictable. The PCM data for all announcements are interleaved so as
to minimize the travel of the moving read/write magnetic heads. 

Fig. 9—Disk storage allocation for a 30-second announcement.

Announcement storage is provided on a five-platter removable disk
pack controlled by an 80-Mbyte moving-head random-access disk
drive. The top and bottom platters serve only to protect the three
operational platters. The three operational platters provide five data
faces and one clock/servo face. The five data faces are divided into
aligned annular tracks; 276 tracks on each face are used operationally.
Each track is further divided into 86 angular sectors, with the sector
boundaries aligned over all tracks and faces. The sets of corresponding
tracks across all data faces form 276 cylinders. Each 30-s announce-
ment segment is concentrated within one angular sector, but spread
across the tracks and faces within that sector (Fig. 9).
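
Combining this geometry with the 64-kbit/s PCM rate gives a feel for how little data is read on each pass over a sector. The Python arithmetic below assumes that a segment's speech payload is spread evenly over all 276 tracks on all five data faces, and it ignores error-correction and formatting overhead; both are simplifications made for the example, and the variable names are illustrative.

```python
DATA_FACES = 5
TRACKS_PER_FACE = 276       # operational tracks per face, i.e., cylinders
SECTORS_PER_TRACK = 86
SEGMENT_BYTES = 30 * 8000   # 30 s of 64-kbit/s PCM, one byte per sample

pieces_per_segment = DATA_FACES * TRACKS_PER_FACE        # 1380 pieces
bytes_per_piece = SEGMENT_BYTES / pieces_per_segment     # about 173.9 bytes
ms_of_speech_per_piece = bytes_per_piece / 8000 * 1000   # about 21.7 ms
all_sectors_bytes = SECTORS_PER_TRACK * SEGMENT_BYTES    # 20640000 bytes
```

Under these assumptions each read of one sector on one track delivers about 174 bytes, or roughly 22 ms of speech, and all 86 sectors together would hold about 20.6 Mbytes of speech payload on the 80-Mbyte drive.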

The disk is scanned in 30-s cycles. This is done first by reading all 
active announcement sectors on the outermost track of each data face 
in succession to complete the scan of the outermost cylinder, cylinder 
0. A small portion of each announcement segment has then been read. 
Then the heads are moved to cylinder 2; the process is repeated for all 
even cylinders to cylinder 274. Starting from cylinder 275, the direction 
of the head motion is reversed to scan the odd-numbered cylinders 
while the heads return to the outer rim of the disk. The complete 
process requires 30 s. 
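
The cylinder visiting order described above is easy to generate, as the following Python sketch shows. The function name `scan_order` is invented for the example, and the per-cylinder time is simply the 30-s cycle divided evenly over 276 visits, so it is an average that includes seek and slip time rather than a measured figure.

```python
CYLINDERS = 276
CYCLE_S = 30.0


def scan_order(cylinders=CYLINDERS):
    """Even cylinders moving inward, then odd cylinders moving outward.

    Assumes an even cylinder count, as in the 276-cylinder case.
    """
    inward = list(range(0, cylinders, 2))         # 0, 2, ..., 274
    outward = list(range(cylinders - 1, 0, -2))   # 275, 273, ..., 1
    return inward + outward


order = scan_order()
ms_per_cylinder = CYCLE_S * 1000 / len(order)     # about 108.7 ms
```

Every cylinder is visited exactly once per cycle and no head movement exceeds two cylinders, which keeps each seek short and its duration predictable.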


5.5 Task scheduling 


In addition to controlling disk-head movement and disk-data trans- 
fer, the disk-data handler program functions as an executive controller 
to schedule other tasks. These tasks are scheduled for intervals when 
disk-data transfer must be suspended for various reasons. During disk- 
data transfers, the controller is fully occupied with this task. 

During the time that the heads are moving from one cylinder to the 
next or “seeking” the next cylinder, the controller is free to execute 
other tasks. These tasks are called “seek jobs” and are limited to 9.6 
ms in duration. In addition, since the disk-transfer rate into the buffers 
exceeds the rate at which the buffers are unloaded, the disk accessing 
must be suspended periodically or “slipped” so that the buffers do not 
overflow. These suspensions can occur after any track has been 
scanned, except when a seek is pending. During the time that the disk 
access is suspended, approximately 9.6 ms, the controller is free to 
execute other tasks, called “slip tasks.” Slip tasks are reserved for self-
testing, which is covered later under overall PUC/MAS maintenance.
Tasks less than 100 µs long can be executed during idle sectors, which
contain no active announcement data. Finally, a period of about 10 µs
is available at the beginning of each sector during which no program
action is necessary to maintain data flow (Fig. 10).
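
The four kinds of free interval and their time budgets can be tabulated and used to decide where a task of a given length could run. The Python sketch below uses the durations quoted in the text; the function `slots_for`, the slot names, and the rule of matching by duration alone are simplifications introduced for illustration and do not describe the actual PIC task dispenser.

```python
# Time budgets taken from the text, in microseconds.
SLOT_BUDGET_US = {
    "seek": 9600,        # heads moving between cylinders
    "slip": 9600,        # disk access suspended to avoid buffer overflow
    "idle_sector": 100,  # sector with no active announcement data
    "preamble": 10,      # start of every sector
}


def slots_for(task_us, self_test=False):
    """Return the slot kinds whose budget can hold a task of this length."""
    if self_test:
        # Slip intervals are reserved for self-testing.
        return ["slip"] if task_us <= SLOT_BUDGET_US["slip"] else []
    return [kind for kind, budget in SLOT_BUDGET_US.items()
            if kind != "slip" and task_us <= budget]
```

For instance, `slots_for(50)` returns `["seek", "idle_sector"]`, while a 5-ms self-test fits only in a slip interval.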


5.6 Operational and maintenance tasks 


Program tasks in MAS that handle communication with the PUC
require only a short amount of time but must be executed frequently.
These actions are covered by a “preamble job” executed in a 10-µs
period near the beginning of each sector. This task is also scheduled at
approximately 250-µs intervals by long-duration tasks, such as seek or
slip jobs. The preamble job unloads the MAS unit receiving register
into a queue maintained in PIC RAM, and loads the reply register from
a second queue in RAM. Operational and maintenance tasks process
the received orders and generate the replies.

All operational tasks are scheduled during seek or idle sector 
intervals. For example, during every fourth seek the seek task dispenser 
schedules a message-timing task. This task performs announcement 
phasing and concatenation of 30-s segments during recording and 
playback. The message-timing task also processes information in the 
announcement status, buffer allocation, starting point, and sector 
allocation tables to update the sector buffer table, which governs disk 
data transfer. In addition, a unique task is selected for each seek 
interval in the 30-s cycle, and additional “once-per-cylinder” tasks are 
executed during each seek interval. 

Self-test tasks are performed during slip intervals. We cover self- 
testing later along with overall maintenance of the PUC/MAS subsys- 
tem. 

Certain short tasks are performed during the 100-µs intervals during 
idle sectors. Two idle sectors per disk revolution are reserved for 
scanning the tone and noise detectors of the digital signal processing 
circuits used in recording. Data are collected which determine the 
average level and noise values for each recording channel. This is used 
to provide automatic gain control; should the signal level or noise 
become unacceptable, the program aborts the recording. 

During other idle sectors, the No. 4 ESS processor order-execution 
routine is called. This routine reads an order out of the receiving 
register queue and passes control to an order routine determined by a 
data field in the order. Typical orders involve allocation or deallocation 
of sectors and buffers, recording, monitoring, and playback of 
announcements, and duplication of sectors. These orders change the 
state of the message tables, thus allowing the message timing and 
control routines to properly execute the desired action. Other processor 
orders can be used to read or write PIC RAM locations, diagnose 
particular circuits, and start or stop the unit. 


5.7 Update tasks 


Duplication of announcements between MAS units via the update 
buffer is another task executed by the PIC. This transfer is performed 
on a per-sector basis. If a sector is to be updated, a No. 4 Ess processor 
order alerts the unit that acts as the source. During the cylinder 0 task, 
the sector buffer table is altered to begin to load the given sector into 
the update buffer. The processor then instructs the receiving MAS unit 
to set up its sector buffer table to unload the update buffer into the 
correct sector on the disk. The receiving unit begins to transfer data 
from the buffer to its disk 15 s after the source unit begins filling the 
buffer. Since each unit has an update buffer, duplication can occur in 
both directions at once. 

If one unit is taken out of service, its disk must be updated before it 
can be restored to service. This maintenance update is initiated by 
processor order. In this process, the update buffer is used to transfer 
all of the useful data on the in-service unit to the out-of-service unit. 
The process begins when the in-service unit accesses cylinder 0. At 
this time, the in-service unit writes all of its message-timing tables into 
the update buffer. This data is used by the receiving unit to interpret 
the announcement data which is written to its disk from the other 
unit. The in-service disk then begins loading disk data into the update 
buffer one cylinder at a time; the receiving unit then empties the 
buffer. The cycle repeats until the update is complete; this requires 
about 60 s. 


VI. MAINTENANCE FEATURES 


Dependability and maintainability are important considerations in 
the design of the PUC and MAS hardware subsystem. These considerations 
are in line with the high reliability and maintenance objectives 
of the entire No. 4 ESS switching system. The PUC/MAS maintenance 
plan is integrated into that of No. 4 ESS. Dependability is achieved by 
ensuring rapid detection of failures and by providing hardware redundancy 
that enables acceptable service to continue in the presence of 
faults. Maintainability requires that maintenance personnel have available 
automated diagnostic tools to permit rapid isolation and repair of 
failures. 


6.1 Maintenance architecture 


Differences in maintenance philosophy exist between the PUC and 
MAS frames. The Peripheral Unit Control circuit is an interfacing 
circuit on which several MAS frames and possibly other future services 
rely. A PUC failure that results in loss of service on all of the connecting 
equipment is intolerable. To avoid this, the PUC is fully duplicated; its 
two simplex halves normally run in synchronism, executing identical 
tasks. Each is fully capable of providing full service should the other 
fail. The synchronization of the two halves complicates the hardware 
design but eases the job of fault detection, as matching between the 
two halves is possible. The MAS frame also consists of two identical 
halves, or units. As has been noted, however, the two MAS units do not 
operate in synchronism; they perform similar tasks but at different 
times. Matching between units is impossible; to aid in fault detection, 
regular self-testing routines are executed. In the event of failure of one 
unit, service continues, but longer waiting times are experienced by 
callers. 


6.2 Fault detection 


Maintenance actions begin with error detection. In the PUC, matching 
between controllers, self-checking logic, and internal duplication 
within controllers are employed to achieve a high level of on-line 
immediate detection of transient and permanent errors. Protection 
against data transmission errors is provided within all the controller 
data paths using coding techniques, loop around, and hardware checkers. 
All data in memory (RAM/ROM) are coded and checked. Critical 
portions of the hardware processors (arithmetic and logic unit, 
maintenance microprocessor, and internal bus multiplexer) are duplicated 
and matched. A local ESR (hardware monitor) is provided for each 
major functional unit to allow high resolution of error location. 
Exercise and test registers are employed to control and test the hardware 
monitors. Errors in the controller are summarized in a primary ESR, 
which, when set, causes a maintenance interrupt to the No. 4 ESS 
processor. The processor calls fault recovery programs for appropriate 
actions. 

The MAS units use fault detection techniques similar to those used 
in the PUC, except that matching between units is not possible and 
additional self-testing is required. Hardware faults in MAS result in bits 
being set on the MAS unit’s ESR, which, in turn, immediately sets a PUC 
ESR bit. 


6.3 Mass Announcement System self-test 


Since the MAS unit does not run in step with its mate unit, matching 
cannot be used for operational error detection. This requires that a 
sufficient number of audits, checks, and tests be written into the 
operational firmware program, along with dedicated self-test hardware, 
to detect faults during normal operation of the unit. Parity is used on 
RAM in the unit to aid in error detection. Also, additional hardware 
access and looping capability has been provided to allow the firmware 
to more easily test the various hardware modules. 

The firmware self-test tasks are executed during seeks, slips, and 
certain dedicated idle sectors. Some disk sectors are not used for 
operational purposes; this allows execution of various miscellaneous 
tasks to provide self-testing. One such task is a scan of error counters 
which are decremented by certain operational and data handler rou- 
tines when an error is found. These counters are loaded with some 
initial value; a negative count causes an error to be reported. This 
technique reduces the effect of transient errors and also reduces the 
load on real-time critical processes. Peripheral Interface Controller 
RAM is checked by performing access tests on RAM data and address 
registers, and parity and hash checks over software protected areas. 
Access tests are also done on all playback and record buffer registers. 
Maintenance buffers are used to do partial memory testing. The disk 
and disk hardware are checked by accessing a dedicated idle sector; 
random data is continuously written, read, and verified on this sector, 
and every track of that sector is processed in a 60-s period. Disk data 
itself is protected by a powerful error detecting and correcting cyclic 
redundancy check code, capable of correcting burst errors of up to 11 
bits in length. 
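
The error-counter technique described above can be illustrated with a short sketch. The class name, the reload-after-report behavior, and the initial value used below are assumptions made for illustration; the text states only that counters start at some initial value, are decremented on each error, and cause a report when the count goes negative.

```python
class ErrorCounter:
    """Threshold counter that tolerates a few transient errors."""

    def __init__(self, initial):
        self.initial = initial   # number of errors tolerated before reporting
        self.count = initial

    def record_error(self):
        """Called by an operational or data-handler routine on each error."""
        self.count -= 1

    def scan(self):
        """Called by the self-test scan; True means the error is reported.

        Assumption: the counter is reloaded once an error has been reported.
        """
        report = self.count < 0
        if report:
            self.count = self.initial
        return report


counter = ErrorCounter(initial=2)
counter.record_error()
counter.record_error()
tolerated = counter.scan()    # two errors: count is 0, nothing reported
counter.record_error()
reported = counter.scan()     # third error: count is negative, reported
```

Because the real-time routine only decrements a counter, the cost of noting an error stays small, and the decision to report is deferred to the lower-priority scan.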

A playback and recording loop test is also performed. Data from the 
playback buffer is looped to a recording buffer and then verified. This 
loop feature is also used to test the digital signal processing circuits. 
Various dc levels and tones are looped through these circuits to ensure 
that the proper filtering action is taken. Since the signal-processing 
circuits are used in a time-multiplexed fashion, the circuits can be fully 
tested by using a maintenance time slot at the same time other time 
slots are being used operationally. 

All slip tasks are dedicated to self-test. The slip task dispenser 
monitors the DS-120 framing circuit and controls update buffer testing. 
The tests executed include an update buffer register test, a march test 
on the memory fabric, and a check of the cross-unit update access 
circuits. Update tests are not executed if update work is in progress. 


6.4 Diagnostic software 


Diagnostic software is available for both PUC and MAS. It can be used 
under control of maintenance personnel as an aid in fault isolation. 
Certain portions or “phases” are invoked automatically before 
out-of-service hardware can be restored to service. Diagnostic software in 
PUC/MAS is, in some cases, executed by the No. 4 ESS processor, and 
consists of a series of tests which send orders to PUC/MAS and evaluate 
responses. Other portions are initiated by processor orders but are 
executed by programs in PUC or MAS. 

The PUC is a multiprocessor controller employing a hardwired 
processor, a bit-sliced microprocessor, and a maintenance microprocessor. 
The diagnostic programs have been developed to suit this 
multiprocessor structure. The diagnostic software contains two kinds of 
programs. No. 4 ESS processor resident diagnostic programs test the PUC 
front-end processor logic, power, clocks, the interface between the PUC 
and the No. 4 ESS processor, and the synchronization between the duplicate 
PUC controllers. Executive controller resident diagnostic routines 
are invoked by No. 4 ESS processor orders. These routines perform 
tests on the arithmetic and logic unit, the priority interrupt encoder, 
the microsequencer, and in general on the logic that is directly under 
control of the executive controller ROM. These test results are passed 
to the No. 4 ESS processor diagnostic programs for analysis and 
decisions. Diagnostics are also resident in the PUC maintenance 
processor. These are executed by the maintenance processor under macro 
commands from the No. 4 ESS processor. These programs perform 
localized self-testing on the MP hardware, and also interact with the 
executive controller to diagnose its hardware. Maintenance processor 
test results are passed to the No. 4 ESS processor via the maintenance 
buffer. 

The MAS diagnostic is composed, like the PUC diagnostic, of a No. 4 
ESS processor resident part and a firmware part. The processor resident 
part provides all necessary interfaces between the PUC unit and the 
rest of the ESS system. The diagnostic first checks power and the PUC/MAS 
interface circuits; these phases are processor resident. The MAS 
firmware part of the diagnostic contains some tests which run only on 
specific processor orders. Other portions allow the self-test firmware 
and hardware to operate for a period of time after which the No. 4 Ess 
processor resident program checks for accumulated errors. Thus, all 
self-test routines, checks, and audits are designed to serve as part of 
the diagnostic, as well as for operational fault detection. In this way, 
the amount of extra diagnostic code is minimized and failures detected 
operationally generate useful fault-related data to help repair the unit. 


VII. SUMMARY 


We have described a No. 4 Ess hardware subsystem that adds a 
flexible capability of recording and playing back announcement mes- 
sages. The subsystem is generally under the control of the No. 4 Ess 
processor, and has several internal microprocessor systems for control 
and maintenance purposes. Flexible circuits for interfacing the 
announcement system to the No. 4 ESS processor allow for the economical 
future addition of new equipment. The announcements themselves are 
stored in digital form on moving-head magnetic disk systems; the 
organization of the stored data is designed with particular care for easy 
access during announcement playback. System reliability is a major 
consideration; error detection, self-test, and diagnostic systems are 
important components of the subsystem. 


No. 4 ESS: 


Network Clock Synchronization 


By R. METZ, E. L. REIBLE, and D. F. WINCHELL 
(Manuscript received August 4, 1980) 


Accurate clock control is a necessity for direct digital transmission 
of voiceband data between No. 4 Ess offices. This paper describes 
synchronization of the No. 4 Ess clock, which consists of phase- 
locking the clock oscillators to an external frequency reference. The 
operation performed is a second-order digital phase-lock with time 
constants of 1.4 hours and 3.12 days. In addition, a fast-start mode is 
provided with time constants of 2.6 minutes and 8.31 minutes. These 
characteristics provide tracking of frequency shifts due to daily 
propagation delay variations on transmission facilities, while filter- 
ing out higher-frequency jitter components of the references. They 
also guarantee stable convergent operation, even with the unlikely 
worst-case of linear oscillator drift. Implementation consists of a unit 
containing two matched and synchronous microprocessors and a 
microprogrammed controller. While the phase detector is in hard- 
ware, the microprocessors provide the remainder of the loop. Firm- 
ware performs the filtering algorithm, oscillator control, unit diag- 
nostics, as well as extensive defensive operational checks. A great 
deal of effort is made in both hardware and firmware to ensure the 
integrity of data written to the oscillators. Finally, experimental 
results show the unit operation tracking design predictions closely. 


I. INTRODUCTION 


The No. 4 ESS is a digital toll switching system, whose time-division 
network routes standard 8-bit PCM signals.’ Digital interfacing of the 
network to T-carrier facilities is provided by the Digital Interface 
Frame (DIF), which converts and concatenates a number of T1 facilities 
into higher-speed serial bit streams for the switch, and vice versa. 
Thus, the basic timing for the switching network is the 8-kHz frame 
rate typical of T-carrier facilities. Timing for the network is provided 
by the Network Clock (NCLK) frame, which distributes a 16.384-MHz 
square-wave pulse train to each of the network frames. The 8-kHz 
framing information is transmitted as a missing pulse in the 16.384-MHz 
signal once every 125 µs. Synchronization of the network consists 
of controlling the frequency of the NCLK such that No. 4 ESS offices 
that are digitally connected run as close as possible to the same 
frequency. This is done by phase-locking the clock oscillators to the 
externally supplied Bell System Reference Frequency (BSRF) or to a 
T1 line from another No. 4 ESS. Thus, the No. 4 ESS is part of an 
overall system timekeeping plan consisting of a master-slave 
hierarchical timing structure.” The BSRF is the master timing source, 
distributed throughout the country to clusters of digitally interconnected 
No. 4 ESS switches. One switch in each cluster is designated as a master 
and is phase-locked to the BSRF, while the other switches are slaved in 
a tree-like structure to the master via timing carried in the digital 
interconnections. 


II. SYNCHRONIZATION REQUIREMENTS 


When two No. 4 Ess offices are directly digitally interconnected by 
a T-carrier facility, data arrives at a DIF at a rate determined by the 
source NCLK, and is read out of buffers into the network according to 
the local NCLK. Thus, differences in clock frequencies between the two 
offices result in DIF buffer overflow or underflow. This is compensated 
for by the loss or repetition of a frame of data, and is called a slip. The 
impairment to PCM voice is negligible for slip rates as high as several 
per second, while the effect on voiceband data is more drastic. Any 
slip is an undesirable loss of information, and thus, the end-to-end slip 
rate objective for the Switched Digital Network has been set at one 
slip in 5 hours.* One-half of the objective (one slip in 10 hours) is 
allocated to digital transmission facilities. The remaining half of the 
objective is allocated to local digital switching systems. Thus, essen- 
tially no slips are allowed for digital toll switching systems. 

The drift rate of the No. 4 ESS NCLK is somewhat better than one 
part in $10^{10}$ per day, which means that oscillator adjustment is 
necessary. 

With the synchronization unit described here, the No. 4 Ess experi- 
ences an essentially zero slip rate, and if the No. 4 Ess is forced by 
trouble to run for two weeks without an external reference, it still 
experiences less than one slip every 20 hours. 
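
The free-running figure can be checked with simple arithmetic. The sketch below assumes that a slip occurs each time the accumulated phase difference reaches one 125-µs frame, that the oscillator starts with zero offset, and that it drifts linearly at the bound of one part in $10^{10}$ per day; the function name is hypothetical.

```python
FRAME_PERIOD_S = 125e-6  # one 8-kHz frame


def seconds_per_slip(fractional_offset):
    """Time for a constant fractional frequency offset to accumulate one frame."""
    if fractional_offset == 0:
        return float("inf")
    return FRAME_PERIOD_S / abs(fractional_offset)


drift_per_day = 1e-10                        # assumed worst-case drift
offset_after_two_weeks = 14 * drift_per_day  # 1.4 parts in 10^9
hours_per_slip = seconds_per_slip(offset_after_two_weeks) / 3600.0
```

Under these assumptions the interval between slips after two weeks is somewhat longer than 20 hours, in line with the claim above.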


III. NO. 4 ESS NETWORK CLOCK CHARACTERISTICS AND 
ARCHITECTURE 


The source of the 16.384-MHz clock frequency is a set of four 
double-oven, quartz-crystal, voltage-controlled oscillators, connected in an 
analog master-slave phase-locked arrangement.’ (See Fig. 1.) One 
oscillator is designated master, and the other three are phase-locked 
slaves. 

Phase error detectors and voting circuits are designed to implicate 
failing oscillators and reconfigure the master-slave arrangement as 
necessary. In addition to the analog frequency-control input, each 
oscillator has a 14-bit digital control, capable of adjusting the frequency 
a maximum of ± four parts in $10^7$, with a nominal least-significant-bit 
sensitivity of $5 \times 10^{-11}$. Finally, the oscillators each provide two phase 
bits, indicating whether the oscillator, if it is a slave, is lagging or 
leading the master in phase by more than 0.5 degree. 
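
The control range follows directly from the word length and the bit sensitivity. In the sketch below, the mapping of the mid-scale word to zero offset is an assumption made for illustration; the text gives only the word length, the range, and the nominal sensitivity.

```python
LSB_SENSITIVITY = 5e-11   # nominal fractional frequency change per bit
WORD_BITS = 14


def fractional_offset(word):
    """Fractional frequency offset for a control word.

    Assumption: mid-scale (2**13) corresponds to zero offset.
    """
    if not 0 <= word < 2 ** WORD_BITS:
        raise ValueError("control word out of range")
    return (word - 2 ** (WORD_BITS - 1)) * LSB_SENSITIVITY


lowest = fractional_offset(0)                     # about -4.1 parts in 10^7
highest = fractional_offset(2 ** WORD_BITS - 1)   # about +4.1 parts in 10^7
```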

The Network Clock Synchronization Unit (NCSU) controls the clock 
frequency via the 14-bit oscillator inputs. While the master is being 
phase-locked to the reference signal, the three slaves are tracked to 
follow the master with the help of the phase bits. Slave updating 
algorithms keep the slaves within ±0.5 degree (at 16.384 MHz) of the 
master. 

Fig. 1—Network-clock block diagram. 


IV. PHASE-LOCK ALGORITHM AND TIME AND FREQUENCY DOMAIN 
PERFORMANCE 


Since stability in time (or phase) of the 8-kHz frame is what is 
ultimately important in preventing slips, phase-lock, as opposed to 
frequency-lock or some other method, was chosen. The phase-lock 
scheme is similar to what is used in the Digital Data System.° 

The major components of the loop are the phase comparator, filter 
section, and the master oscillator. (See Fig. 2a.) 

Inputs to the phase comparator are a 4-kHz reference signal, 8-kHz 
frame pulse, and 4.096-MHz clock, the latter two derived from the 
local oscillator in No. 4 ESS. A phase comparison consists of starting a 
counter with the leading edge of the 4-kHz reference signal, and 
subsequently stopping it with the next 8-kHz frame pulse. The 4.096-MHz 
clock is counted in between. (See Fig. 2b.) Before each count, 
the counter is preset to -256, thus yielding a phase comparator output 
range of -256 to +255 for a given comparison, where 0 corresponds to 
zero phase error and the exact half-way interspersing of 8-kHz frame 
pulses and 4-kHz reference pulses. It is evident that a phase comparison 
is done every 250 µs. 
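
A single comparison can be modeled as follows. This is a sketch only: the delay is expressed in integer nanoseconds to avoid floating-point rounding, and the function name is hypothetical.

```python
CLOCK_HZ = 4_096_000      # counting clock
PRESET = -256             # counter value loaded before each comparison
FRAME_PERIOD_NS = 125_000


def phase_count(delay_ns):
    """Counter value when the frame pulse follows the reference edge by delay_ns."""
    if not 0 <= delay_ns < FRAME_PERIOD_NS:
        raise ValueError("the next frame pulse arrives within one frame period")
    ticks = delay_ns * CLOCK_HZ // 1_000_000_000   # whole clock periods counted
    return PRESET + ticks


centered = phase_count(62_500)    # frame pulse exactly half-way: zero error
earliest = phase_count(0)
latest = phase_count(124_999)
```

One frame period of 125 µs spans 512 clock periods, which is why the preset of -256 centers the output range on zero.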

During a typical 8.192-s interval, $2^{15}$ such phase measurements are 
made, added, and divided by $2^{15}$ to yield an average phase error for the 
8-s interval. Deviations from this are described later in the operational 
firmware. This average phase measurement is then multiplied by $2^{-15}$ 
and added to a running sum called the integral term. Finally, the 
running sum is added to the average phase measurement, forming a 
14-bit frequency control word which is sent to the oscillator. 
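
One 8.192-s cycle of this computation can be sketched as below. Floating-point arithmetic is used for clarity; the scaling, sign conventions, and fixed-point format of the actual firmware, and the mapping of the result onto the 14-bit word, are not given here and are left out. The function name is hypothetical.

```python
SAMPLES_PER_CYCLE = 2 ** 15      # 8.192 s of comparisons at 250 µs each
INTEGRAL_SCALE = 2.0 ** -15      # scaling at the input of the integral path


def update_loop(integral, samples):
    """Run one cycle; return the new integral term and the control value."""
    average = sum(samples) / len(samples)        # average phase error
    integral = integral + average * INTEGRAL_SCALE
    control = integral + average                 # proportional plus integral
    return integral, control


# Example: a constant phase error of 16 counts over one full cycle.
integral, control = update_loop(0.0, [16] * SAMPLES_PER_CYCLE)
```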

Since the 8.192-s sampling interval is relatively frequent compared 
to the time constants of the loop, we can model the system in a 
continuous form. (See Fig. 2c.) Note that $\phi_r$ is the phase of the 
reference signal, $\phi_l$ the local frame pulse phase, and $\phi_e$ the difference 
between them. The multiplier in the integral branch is $\beta$, and the 
oscillator sensitivity and phase comparator gain are combined into a 
common term $\alpha$. Therefore, the phase of the local frame pulse can be 
expressed as 

$$\phi_l(s) = \frac{1}{s}\,\alpha\left(1 + \frac{\beta}{s}\right)\phi_e(s), \tag{1}$$

and the phase error is 

$$\phi_e(s) = \frac{s^2}{s^2 + \alpha s + \alpha\beta}\,\phi_r(s). \tag{2}$$

To determine $\alpha$, we consider the following: Since the phase comparator 
has a linear region of 125 µs, corresponding to 512 bits, each bit 
represents 244 ns. A 1-bit change causes the oscillator frequency to change 
by $5 \times 10^{-11}$. Thus 

$$\alpha = \frac{5 \times 10^{-11}}{244\ \text{ns}} = 2.048 \times 10^{-4}/\text{s}. \tag{3}$$

Another way to think of this is in the open-loop sense: Given a 1-bit 
change in oscillator input, how long will it take for the phase comparator 
output to change by one bit? It requires 244 ns of phase shift per 
bit of comparator output, which corresponds to four cycles of 16.384 
MHz. Therefore, 

$$t = \frac{4}{5 \times 10^{-11} \times 16.384 \times 10^{6}}\ \text{s} = 1.36\ \text{h} = \frac{1}{\alpha}. \tag{4}$$

Since the scaling factor at the input of the integral path is $2^{-15}$, and 
updating of the oscillator is done every 8.192 s, $\beta$ is given by 

$$\beta = \frac{2^{-15}}{8.192\ \text{s}} = 3.73 \times 10^{-6}/\text{s} \quad\text{and}\quad \frac{1}{\beta} = 3.1\ \text{days}, \tag{5}$$

which is the time constant of the integral portion of the phase-locked 
loop. 
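
The normal-mode loop constants can be reproduced numerically from the quantities quoted above (the 125-µs linear range, 512 comparator bits, the bit sensitivity, the $2^{-15}$ scaling, and the 8.192-s update interval). The variable names below are illustrative.

```python
LSB_SENSITIVITY = 5e-11          # fractional frequency change per control bit
BIT_TIME_S = 125e-6 / 512        # phase per comparator bit, about 244 ns
UPDATE_INTERVAL_S = 8.192
INTEGRAL_SCALE = 2.0 ** -15

alpha = LSB_SENSITIVITY / BIT_TIME_S          # proportional gain, per second
beta = INTEGRAL_SCALE / UPDATE_INTERVAL_S     # integral gain, per second

tau_alpha_hours = 1.0 / alpha / 3600.0
tau_beta_days = 1.0 / beta / 86400.0
```

The two time constants come out at roughly 1.36 hours and 3.1 days, matching Eqs. (4) and (5).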

A fast-start mode is provided for synchronization under startup 
conditions. It has the ability to resolve much larger frequency offsets 
between the clock oscillators and the incoming reference. It is 
characterized by shortened time constants and a much wider capture range 
(±4 parts in $10^7$): 

$$\alpha' = 32\alpha = 6.55 \times 10^{-3}/\text{s}, \qquad 1/\alpha' = 2.6\ \text{min}, \tag{6}$$

$$\beta' = 512\beta = 1.9 \times 10^{-3}/\text{s}, \qquad 1/\beta' = 8.13\ \text{min}. \tag{7}$$


Figure 3 shows the computation algorithm for both fast-start and 
normal modes. 


V. PHASE AND FREQUENCY RESPONSE 

Since it is phase drift in the 8-kHz framing of the No. 4 Ess clock 
that will cause slips on transmission facilities, it is the phase response 
of the synchronization unit that we are ultimately interested in. There- 
fore, let us first determine the response to an input step in frequency. 
The phase error, in time, is given by 

$$\phi_e(t) = \frac{\Delta F}{b}\, e^{-at} \sin(bt), \tag{8}$$

where 

$$a = \frac{\alpha}{2}, \qquad b = \left(\alpha\beta - \frac{\alpha^2}{4}\right)^{1/2}.$$

Figure 4a shows that the phase error builds up with a time constant of 
$1/\alpha$ to a peak of 4.45 kHz/Hz at 19.92 ks and then dies out with a time 
constant of $1/\beta$. 

Fig. 3—Synchronization unit—main algorithm. 

Correspondingly, the frequency response to a step in frequency is 
shown in Fig. 4b. The initial error is approximately $\Delta F$, heading 
through zero with time constant $1/\alpha$, and then decaying to zero with 
time constant $1/\beta$. Note that in the limit, both the frequency and the 
phase error diminish to zero. 
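
The shape of the phase response to a frequency step can be explored numerically. The sketch below inverts the second-order error transfer function of Eq. (2) for a unit frequency step. With the normal-mode gains the poles are real, so the response is written as a difference of two exponentials, which is equivalent to Eq. (8) with an imaginary $b$. The gain values are the rounded ones quoted in Section IV, and the function names are hypothetical.

```python
import math

ALPHA = 2.048e-4   # proportional gain, per second
BETA = 3.73e-6     # integral gain, per second


def loop_poles(alpha=ALPHA, beta=BETA):
    """Roots of s^2 + alpha*s + alpha*beta for the overdamped case."""
    disc = alpha * alpha / 4.0 - alpha * beta
    if disc <= 0.0:
        raise ValueError("this sketch covers only the overdamped case")
    root = math.sqrt(disc)
    return -alpha / 2.0 + root, -alpha / 2.0 - root


def phase_error(t, delta_f=1.0, alpha=ALPHA, beta=BETA):
    """Phase error at time t after a frequency step of size delta_f."""
    s1, s2 = loop_poles(alpha, beta)
    return delta_f * (math.exp(s1 * t) - math.exp(s2 * t)) / (s1 - s2)


def peak_of_step_response(alpha=ALPHA, beta=BETA):
    """Time and height of the phase-error peak for a unit frequency step."""
    s1, s2 = loop_poles(alpha, beta)
    t_peak = math.log(s2 / s1) / (s1 - s2)
    return t_peak, phase_error(t_peak, 1.0, alpha, beta)


t_peak, peak = peak_of_step_response()
```

With these rounded gains the peak falls near 20 ks, within a few percent of the values quoted for Fig. 4a; the small difference is to be expected from rounding and from the continuous approximation.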

Next, if the input takes a step in phase, the phase-error response is 
given approximately by 

$$\phi_e(t) = \left(\frac{\alpha}{\alpha-\beta}\,e^{-\alpha t} - \frac{\beta}{\alpha-\beta}\,e^{-\beta t}\right)\Delta\phi_r. \tag{9}$$

As shown in Fig. 5a, the initial offset is of course $\Delta\phi_r$, decaying 
predominantly with the $1/\alpha$ time constant. The effect on frequency is 
shown in Fig. 5b. 

Fig. 4—(a) Phase response to input step in frequency. (b) Frequency response to 
input step in frequency. 

Finally, we can consider what happens if there is a linear frequency 
drift in the input signal (or the oscillator output). Although this linear 
drift is an unlikely situation, it forms a worst-case bound on what the 
oscillator or the references can do, short of a faulty condition. For a 
drifting input, moving at $\Delta F$ Hz/s, 

$$\phi_e(t) = 2\,\Delta F\left[\frac{1}{\alpha\beta} + \frac{e^{-\alpha t}}{\alpha(\alpha-\beta)} - \frac{e^{-\beta t}}{\beta(\alpha-\beta)}\right].$$

Note that at zero time, the phase error starts out at zero, and in 
time builds up to a constant offset of $2\Delta F/\alpha\beta$. This is important, since it 
determines the maximum phase error for a given drift characteristic of 
the oscillator. For example, if the worst-case drift is $1 \times 10^{-10}$ per day, 
the constant long-term phase error would be about 3 µs, or 12 bits from 
the phase comparator. This is tolerable, indicating that in the worst 
case, the No. 4 ESS frame might shift a constant amount of about 1/40 of a 
frame, and in no case will phase continuously drift. 
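
The worst-case figure can be checked numerically. The sketch assumes the long-term offset is twice the drift rate divided by the product of the two loop gains, and uses the rounded normal-mode gains from Section IV together with a drift of one part in $10^{10}$ per day.

```python
ALPHA = 2.048e-4        # proportional gain, per second
BETA = 3.73e-6          # integral gain, per second
BIT_TIME_S = 244e-9     # phase per comparator bit
FRAME_PERIOD_S = 125e-6

drift_per_second = 1e-10 / 86400.0               # one part in 10^10 per day
steady_offset_s = 2.0 * drift_per_second / (ALPHA * BETA)
offset_bits = steady_offset_s / BIT_TIME_S
offset_frames = steady_offset_s / FRAME_PERIOD_S
```

The result is close to 3 µs, or about 12 comparator bits, which is a small fraction of the 125-µs frame.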


VI. SYNCHRONIZATION UNIT HARDWARE DESIGN 


Figure 6 is a block diagram of the synchronization unit hardware. 
The heart of the system is a pair of microprocessors with a fully 
duplicated Read Only Memory (ROM), Random Access Memory (RAM), 
I/O, and interrupt complex. A bit-sliced microprogrammed controller 
has Direct Memory Access (DMA) and interrupt access to the 
microprocessor community, and performs unit control functions, as well as 
provides a duplicated interface to the No. 4 ESS Central Control (CC) 
processor via the peripheral bus. Small microcoded programs in the 
controller respond to 76 different diagnostic and control orders from 
the CC by performing microprocessor DMAs and interrupts, data routing, 
register loading, etc., as required. Three orders are used by the 
controller for self-diagnosis, one exercising the instruction set, stack, 
and program counter, the other two testing the program parity and 
parity check circuits. In addition to a conventional check of good 
program parity, one of these two orders increments through a program 
section where every line has bad parity in a vector form that completely 
tests the parity tree and check circuits. Parity should fail for every line 
of this test. 

During phase-lock operation, the phase comparator circuit generates 
an output every 250 µs, representing the relative phase of the incoming 
reference and local frame. At initial startup, and after a phase “hit,” 
the phase may be arbitrarily adjusted, under program control, by two 
special phase build-out circuits, one for T1, the other for BSRF. This 
maximizes the capture range of the loop, and minimizes phase excursions 
at startup and after reference hits or losses. 

Fig. 6—Synchronization-unit hardware block diagram. 

The operational program then reads the 250-µs comparison samples, 
and performs the algorithm described before. Oscillators are updated 
every 8.192 s nominally, plus computational and administrative 
overhead. 

It is extremely important that bad data should never be written to 
an oscillator. Several levels of hardware and firmware protection are 
employed to ensure the quality of oscillator writes. The first and most 
important is the duplication of the microprocessors, including 32 K of 
program ROM, 4 K RAM, 51 I/O ports, and the interrupt hardware. The 
processors run synchronously and parity over address and data is 
matched between the two. A mismatch inhibits writing to the oscilla- 
tors, interrupts and halts the processors, and sets an error in the error 
register. To further protect the data sent to the oscillators, I/O ports 
containing oscillator data are cross-coupled; that is, when a processor 
writes a port and then reads it back, it reads back that port from the 
other processor. Thus, each processor can match its data against the 
other, and a write is performed only if the data is the same. Even then, 
a write to an oscillator will succeed only if it is performed exactly 
simultaneously by both processors. After the write, the data is read 
back from the oscillator and is once again verified against what was 
calculated. 
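
The layered checks on an oscillator write can be summarized in a short sketch. This is a software analogy only: the `Oscillator` class and the function `guarded_write` are hypothetical, and the requirement for exactly simultaneous writes by both processors is a hardware property that is not modeled.

```python
class Oscillator:
    """Stand-in for an oscillator control register (hypothetical)."""

    def __init__(self):
        self.word = 0

    def write(self, word):
        self.word = word

    def read(self):
        return self.word


def guarded_write(word_p0, word_p1, oscillator):
    """Write only if both processors computed the same control word.

    Returns True if the write was performed and the read-back matched.
    """
    if word_p0 != word_p1:       # cross-coupled port comparison fails
        return False             # inhibit the write entirely
    oscillator.write(word_p0)
    return oscillator.read() == word_p0   # verify against the calculated value


osc = Oscillator()
first = guarded_write(8192, 8192, osc)    # matching words: write proceeds
second = guarded_write(8200, 8201, osc)   # mismatch: oscillator left untouched
```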

The cross-coupled I/O ports are also used by the diagnostics to 
verify programs and data stored in ROM. Each processor can read out 
its own ROM and exchange it with the other via these ports and, thus, 
check the two for equality. 

Another hardware feature that aids in diagnosing the processors is 
a pair of address cross-coupled read/write I/O ports. When micropro- 
cessor 0 writes ports A and B, and then reads them back, it reads A 
and B, respectively. Processor 1, however, reads them back as B and A, 
respectively, effectively doing an address interchange of data. A section 
of diagnostic code uses these ports to create differential code in RAM, 
which is subsequently executed to exercise and check the bus parity 
match circuits. 

To monitor performance of the unit, a status panel provides a 
readout of the current phase error and integral term, as well as 
indications of the operating mode and reference in use. An EIA com- 
patible interface on the status panel also allows monitoring of the 
synchronization process by a terminal. This terminal interface, in 
conjunction with about 5 K bytes of the program, provides various 
monitoring modes, as well as firmware utilities, described later. 


VII. FIRMWARE 


The synchronization unit microprocessor-based firmware consists 
of three parts: operational programs, diagnostics, and utilities. 
The purpose of the operational program is to implement the 
phase-lock algorithm. The diagnostic portion performs tests on the hardware. 
The utilities provide the tools necessary for program development. In 
the following sections, these three areas will be described in detail. A 
hierarchy chart appears in Fig. 7, and a state diagram in Fig. 8. 


7.1 Operational firmware 


As shown in Fig. 7, the operational program consists of two parts, 
the “main loop” and the interrupt handlers. The main loop is where 
the unit usually resides, and is where the phase-lock algorithm is 
performed. The interrupts are the means by which the main loop is 
entered; they can also change the operating environment or perform 
various other tasks related to synchronization. 

A state diagram is shown in Fig. 8. The “phase-lock mode” state is 
where the main loop is executed. The processors are in a “halted” state 
after reset and trap events. In the “free run” state the oscillators are 
not written. Certain events (like interrupts) cause a transition from 
one state to another. 

Fig. 7—Firmware structure. 


7.1.1 Main loop 


The main loop is an 8-s cycle that performs the synchronization 
algorithm in Fig. 3. In this loop, a phase measurement is made, the 
master written, and the slaves written. When the synchronization unit 
is in fast start or normal mode, this loop is executed repeatedly. There 
are seven main parts to the cycle (Fig. 7). 

The first part of the main loop consists of the miscellaneous tasks. 
One such task is the maintenance of event counters such as: 

(i) The number of 8-s cycles executed since the beginning of fast 
start. 
(ii) Time elapsed since the last power up. 
(iii) The duration of reference outages. 

Also, the I/O is read for the master oscillator number. A special RAM 
location called the reference register is read to determine whether 
BSRF or T1 is the reference signal to use, and bits in the I/O are written 
to select that signal. An I/O port is read to find out which references 
have outages, and another I/O port is written to screen irrelevant or 
unwanted interrupts. The integral terms and oscillator words are 
copied into a special area of RAM, which is read by the CC and put into 
the “critical registers” for the synchronization unit. 

The next part in the process is the calculation of the master’s new 
14-bit control word. This program first decides whether to call a 
routine to perform a phase build out (the shifting of the reference 
signal to produce a desired phase error). At the beginning of fast start, 
it is necessary to build out to a phase error of zero, whereas after a 
phase hit the build out is to the previous phase measurement. T1 and 
BSRF build out are performed by different hardware. From the 
firmware’s point of view, the build-out algorithm is as follows: 

(i) Write the build-out register (T1 or BSRF) to 0 and take a phase 
measurement, A. 

(ii) If B is the desired phase error, then write the build-out register 
to C, where 

$$
C = \begin{cases}
\dfrac{A - B}{16} + 32, & A < B, \\[4pt]
\dfrac{A - B}{16}, & A \geq B.
\end{cases}
$$

The accuracy of this procedure is ±9 bits of phase error. For the 
BSRF, the accuracy is enhanced to ±3 bits by using a diagnostic access 
port to cut off the BSRF signal for 13 μs, which causes a shift in the 
reference signal as it appears at the phase comparator. This is done 
repeatedly until the desired accuracy is achieved. The process never 
requires more than seven iterations. 
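
The register value in step (ii) can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and rounding the quotient to the nearest integer is an assumption, since the text does not say how the division by 16 is rounded.

```python
def build_out_value(measured, desired):
    """Return the build-out register value C for measurement A and target B."""
    # Assumption: the quotient is rounded to the nearest integer; the text
    # does not specify the rounding used by the firmware.
    quotient = round((measured - desired) / 16)
    # An offset of 32 is added when A < B, as in step (ii).
    return quotient + 32 if measured < desired else quotient
```

For example, a measurement of 160 with a desired phase error of 0 gives a register value of 10, while the reverse case (A = 0, B = 160) gives -10 + 32 = 22.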

The next step in the process is taking a phase measurement. The 
phase comparator produces a phase measurement every 250 μs. 
The phase error takes on values between −255 and +255 and 
appears in a 16-bit I/O port. A firmware routine forms the sum of 
$2^{12}$ consecutive phase measurements (which takes about 1 s) and 
then computes the average phase error. During the process, adjacent 
250-μs phase measurements are checked for jitter by requiring that 
they differ by no more than two bits. If they differ by more than two, 
the event is recorded and the summing routine starts over. Starting 
over too often results in an error reported to the CC. 
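
A minimal sketch of this averaging routine follows. The names `read_phase` and `max_restarts` are hypothetical; in particular, the text does not say how many restarts count as "too often", so that limit is an assumed parameter.

```python
def one_second_average(read_phase, n=4096, max_jitter=2, max_restarts=8):
    """Average n consecutive phase samples, restarting the sum on jitter.

    read_phase is a callable returning the next 250-microsecond sample.
    Returns (average, restarts). The max_restarts limit is an assumption.
    """
    restarts = 0
    while True:
        total = 0
        previous = None
        clean = True
        for _ in range(n):
            sample = read_phase()
            # Adjacent samples may differ by no more than max_jitter bits.
            if previous is not None and abs(sample - previous) > max_jitter:
                clean = False
                break
            total += sample
            previous = sample
        if clean:
            return total / n, restarts
        restarts += 1
        if restarts > max_restarts:
            # Stands in for the error reported to the central control.
            raise RuntimeError("too many jitter restarts")
```

With 4096 samples at 250 μs each, one pass takes 1.024 s, which is the "1 s" measurement referred to in the text.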

The time spent updating slaves can consume a considerable portion 
of the 8-s cycle. Therefore, the phase measurement time is varied to 
compensate, by subtracting the time spent updating slaves in the last 
cycle from 8 s. The resulting time dictates the number of consecutive 
1-s phase measurements that are made, which varies from 1 to 7. 
The average phase error is then computed from the 1-s averages. The 
difference between adjacent 1-s phase measurements is monitored, 
and, if it exceeds three, then a phase hit has occurred, and a build out 
to the last good phase measurement is performed. 
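
The time budgeting and phase-hit check can be sketched as follows. The function names are hypothetical, and truncating the remaining time to whole seconds is an assumption; the text only states that the count varies from 1 to 7.

```python
CYCLE_SECONDS = 8

def measurement_count(slave_update_seconds):
    """Number of 1-s phase measurements that fit in the rest of the cycle."""
    # Assumption: the remaining time is truncated to whole seconds.
    remaining = int(CYCLE_SECONDS - slave_update_seconds)
    return max(1, min(7, remaining))

def first_phase_hit(one_second_averages, threshold=3):
    """Return (index, last_good_value) of the first phase hit, or None."""
    for i in range(1, len(one_second_averages)):
        if abs(one_second_averages[i] - one_second_averages[i - 1]) > threshold:
            # The build out targets the last good measurement.
            return i, one_second_averages[i - 1]
    return None
```

If the slaves took 3.2 s to update in the previous cycle, four 1-s measurements are made in the current one.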

The 8-s phase measurements are monitored from cycle to cycle. The 
decision to transfer from fast start to normal mode is based on 
the magnitude of the phase error (must be ≤ 1) and the slope of the 
phase error (must be ≤ 1/80 bit/s). In normal mode, if the slope of the 
phase error curve exceeds 164/40 or 164/400 or 164/4000 bits/s, then 
an error is reported to the CC and the unit enters a free-run state. If the 
magnitude of the phase error exceeds 230 at any time, then an error is 
reported and the free-run state is entered. 
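
The two unambiguous checks in this paragraph can be written as simple predicates. This is a sketch with hypothetical names; comparing the magnitude of the slope (rather than its signed value) is an assumption, and the normal-mode slope limits are omitted because the text does not say which of the three values applies when.

```python
FAST_START_MAX_PHASE = 1       # bits of phase error
FAST_START_MAX_SLOPE = 1 / 80  # bits per second
FREE_RUN_PHASE_LIMIT = 230     # bits of phase error

def ready_for_normal_mode(phase_error, slope):
    """True when fast start may hand over to normal mode."""
    # Assumption: both limits apply to magnitudes.
    return (abs(phase_error) <= FAST_START_MAX_PHASE
            and abs(slope) <= FAST_START_MAX_SLOPE)

def must_free_run(phase_error):
    """True when the phase error magnitude forces the free-run state."""
    return abs(phase_error) > FREE_RUN_PHASE_LIMIT
```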

Having gone through the checks in phase error, the next step is to 
update the integral term (see Fig. 3). The new integral term is equal to 
the old one plus a constant, A, multiplied by the phase error. The 
master’s word calculation follows as the integral term, plus another 
constant, B, multiplied by the phase error. The integral term, a 29-bit 
quantity, can be thought of as the 3-day average of the oscillator word, 
a 14-bit quantity. 
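
The update just described is a proportional-plus-integral step, sketched below. The function name and the use of floating-point values are assumptions for illustration; the firmware holds the integral term in 29 bits and the oscillator word in 14 bits, and the scaling between the two is not reproduced here.

```python
def update_master_word(integral, phase_error, a, b):
    """One control step: returns (new_integral, new_master_word).

    a and b correspond to the constants A and B in the text. Fixed-point
    scaling between the integral term and the word is deliberately omitted.
    """
    new_integral = integral + a * phase_error
    # The master's word is the updated integral term plus B times the error.
    word = new_integral + b * phase_error
    return new_integral, word
```

Because the integral term accumulates only a small fraction of each error, it changes slowly and behaves like a long-term average of the oscillator word.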

Before writing the master with the just-calculated word, a basic 
check is made for a reasonable value. If the word presently in the 
master differs from the new word by more than 256 bits (4096 for fast 
start), then a processor-detected error is set (a bit in the I/O which 
results in an interrupt to the CC) and free-run mode is entered. If the 
difference is greater than 64, then the master is walked in steps of 64 
to the desired value, with the slaves updated at each stage. This 
prevents NCLK-frame-detected phase errors from occurring. 
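
The reasonableness check and the walk in steps of 64 can be sketched as below. The function name is hypothetical, and raising an exception stands in for setting the processor-detected error bit and entering free run.

```python
def master_write_steps(current, target, fast_start=False):
    """Return the sequence of words written to move the master to target."""
    limit = 4096 if fast_start else 256
    if abs(target - current) > limit:
        # Stands in for the processor-detected error and free-run entry.
        raise ValueError("new word differs too much from the present word")
    steps = []
    word = current
    # Walk in steps of 64 while more than 64 bits from the target.
    while abs(target - word) > 64:
        word += 64 if target > word else -64
        steps.append(word)
    steps.append(target)
    return steps
```

For a master at 1000 and a new word of 1150, the writes are 1064, 1128, and 1150, with the slaves updated after each one.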

Anytime an oscillator is written, a common routine is called which 
performs some defensive checks and then writes the oscillator. The 
first check made is that the word to write cannot differ by more than 
256 from the word in the oscillator. The second is that both processors 
must agree on the word that is to be written. This is accomplished by 
using a crossed I/O port where each processor can read what the other 
wrote. 

The next line of defense is in the hardware. The program writes an 
I/O port, which will strobe the data in the crossed ports into the 
oscillators if (1) the oscillator clamps (CC-controlled gates which, when 
set, inhibit oscillator writing) are not set, (2) the processors are 
synchronous, and (3) there are no errors in the hardware error source 
register. 
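
The software and hardware checks on an oscillator write can be summarized in a short sketch. All names here are hypothetical, and the hardware conditions are modeled as plain fields rather than real I/O ports.

```python
from dataclasses import dataclass

@dataclass
class HardwareState:
    clamps_set: bool              # CC-controlled oscillator clamps
    processors_synchronous: bool  # duplicated processors running in step
    error_source_register: int    # nonzero means a hardware error is latched

def software_checks_pass(current_word, word_p0, word_p1, max_delta=256):
    """Both processors must agree, and the change must not exceed 256."""
    return word_p0 == word_p1 and abs(word_p0 - current_word) <= max_delta

def strobe_allowed(hw):
    """Hardware gate: all three conditions must hold for the strobe."""
    return (not hw.clamps_set
            and hw.processors_synchronous
            and hw.error_source_register == 0)
```

A write reaches the oscillator only when both functions return true, so a single misbehaving processor cannot change the oscillator word on its own.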

The last line of defense is that the program always executes a 
100-ms pause between oscillator writes. This allows time for the NCLK 
frame to detect a phase error and interrupt the CC, which, in turn, will 
set the oscillator clamps so that no more writes can be made. Since 
the clock frame can run on three oscillators, no degradation of service 
occurs, only a loss of redundancy. 

The next step in the process is to deal with the slaves. There is an 
I/O port which reflects the state of the oscillator clamps. Those slaves 
which are unclamped are updated. As a first approximation, whatever 
frequency control word offset was last written to the master is written 
to each slave. Next the oscillator phase bits are used to more precisely 
align the slaves to the master. 

There is a 2-bit phase indication (slave relative to master) in the 
I/O for each slave. These take on the values 0, 1, or 2 depending on 
whether the slave is within ±0.5 degrees of the master, greater than 
+0.5 degrees, or less than −0.5 degrees, respectively. The zero-condition 
window, usually about 50 oscillator word-bits wide, is measured 
periodically. If a slave is found with nonzero phase bits (after writing the 
master’s change to it), then one-half of the window width is added to or 
subtracted from the word in the slave. If the phase bits are still nonzero 
for the slave, then it is walked towards its window until the phase bits 
become zero. 
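
The slave alignment procedure can be sketched as follows. The names are hypothetical, and two details are assumptions: that a slave reading "greater than +0.5 degrees" is corrected by lowering its word, and that the final walk proceeds one bit at a time.

```python
IN_WINDOW, AHEAD, BEHIND = 0, 1, 2  # values of the 2-bit phase indication

def align_slave(word, phase_bits, window_width=50, max_walk=4096):
    """Bring a slave into its zero-condition window and return its word.

    phase_bits is a callable giving the phase indication for a word.
    Assumption: AHEAD is corrected by lowering the word, BEHIND by raising it.
    """
    code = phase_bits(word)
    if code == IN_WINDOW:
        return word
    # First correction: half the measured window width.
    half = window_width // 2
    word += -half if code == AHEAD else half
    # Then walk one bit at a time until the phase bits read zero.
    for _ in range(max_walk):
        code = phase_bits(word)
        if code == IN_WINDOW:
            return word
        word += -1 if code == AHEAD else 1
    raise RuntimeError("slave could not be brought into its window")
```

With a window of 50 bits centered on word 1000, a slave starting at 1100 is first moved to 1075 and then walked down to 1025, the upper edge of the window.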

The next step in the main loop is the writing of periodic data to the 
terminal. The following information is written: phase error; the four 
oscillator words; the integral term; which oscillators are enabled; the 
master; the mode (fast start or normal mode); the reference (BSRF or 
T1); and the cycle counter. 

As part of the main operation loop, routine exercise is performed 
every hour when the unit is in normal mode. The following diagnostic 
routines are called: ROM test, I/O test, processor self-test, phase 
comparator test, BSRF interface test, T1 interface test, oscillator buffer 
test, and the oscillator write tests. Any failures result in a processor- 
detected error and free run. 

A set of performance reports is maintained. These reports include: 
number and duration of free runs, reference outages, phase hits, 
frequency offsets, and fast starts. 


7.1.2 Operational interrupts 


The operational interrupt handlers perform various functions while 
the unit is in phase-lock mode, or they may be used to place the 
unit into phase-lock mode. The interrupts come from the CC or 
directly from hardware in the unit. The interrupt handlers are listed 
in Fig. 7. 

Init is an order from the CC to initialize RAM and the I/O ports. Fast 
start, normal, and free run are all orders that cause the unit to enter 
the respective states. During free run, the program is in a loop which 
increments a free-run duration counter. The reference change order is 
used by the CC to load the reference register. 

The terminal handler processes requests from the terminal. A set of 
commands exists which allows one to change the terminal printing 
format, have the machine performance report printed, run routine 
exercise, or print information about the last processor-detected error. 
In addition, the status panel may be placed in a trace mode, where the 
present program state and last interrupt processed are displayed. With 
another command one may exit to the executive program. 

The dead-slave interrupt is used to bring on line a new oscillator, or 
one whose frequency is very different from the master’s. In this region, 
the slave is out of phase-lock with the master and the 2-bit phase 
indication in the I/O is meaningless. The phase comparator is used to 
measure the frequency difference between the slave and the master. 
The output of the slave is used as the reference input to the phase 
comparator, and then the rate of change of phase error is measured. 
Based on the frequency difference calculated, a new slave word is 
calculated and written to the slave. The process continues until the 
clock frame is able to phase lock the slave and then the phase bits are 
used to bring the slave in. 

Reset is the highest level of interrupt. Upon receiving this interrupt, 
some code is executed which initializes variables and then the proces- 
sors halt. Interrupts from the halted state result in synchronous 
operation of the processors. 

Executing an illegal op code results in a trap. The trap handler 
records the program counter in RAM, prints it on the terminal, and 
then halts. 


7.2 Diagnostics 


The diagnostic routines are listed in Fig. 7 and are briefly described 
here. 

The memory tests are divided into ROM check and RAM check. For 
ROM check a crossed I/O port is used where each processor reads what 
the other one wrote. By reading each ROM location, writing the data to 
the crossed port, and reading the crossed port, the processors compare 
the two ROMs. 

The RAM test uses a conventional walking 1’s and 0’s algorithm. In 
order to discover a wider class of faults, the RAM test routine uses I/O 
ports for scratch pad memory, rather than RAM. The read/write I/O 
ports have been previously tested by CC-based diagnostics. 
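
A simplified walking 1's and 0's test is sketched below for illustration. It checks each cell in isolation and uses ordinary Python variables as scratch storage, whereas the firmware keeps its scratch values in I/O ports; the function name and the 8-bit cell width are assumptions.

```python
def walking_bit_test(ram, width=8):
    """Return the first failing address, or None if every cell passes.

    ram is any indexable, sized object of integer cells. This simplified
    version does not look for coupling between different cells.
    """
    mask = (1 << width) - 1
    for address in range(len(ram)):
        for bit in range(width):
            one = 1 << bit
            # Walking 1: a single 1 bit, then walking 0: a single 0 bit.
            for pattern in (one, mask ^ one):
                ram[address] = pattern
                if ram[address] != pattern:
                    return address
        ram[address] = 0
    return None
```

Running the test on a healthy 16-cell `bytearray` returns `None`; a cell with one bit stuck at zero is reported by its address.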

A processor self-check is implemented in the firmware. Here the 
instruction set is verified for various addressing modes by performing 
sample problems and comparing results against constants. 

The phase comparator, BSRF interface circuitry, and T1 interface 
circuitry are tested by firmware routines. The build-out circuits are 
also tested by these routines. Test vectors are applied through diag- 
nostic access points in the I/O. 

The ability to read and write oscillators is verified by a firmware 
routine. Also, the performance of the oscillator phase bits is checked. 
The clamps between the synchronization unit and the oscillators are 
also tested here. 

Other diagnostic routines test the processor match circuitry, the 
address comparator, and the terminal interface. A routine that can be 
called by the CC performs a hash sum over the memory to verify the 
version of the program. 


7.3 Utility program 


A utility (executive) program is also included in the synchronization 
unit firmware. This program deals with commands typed in at the 
terminal. The commands enable one to read and write memory, load 
a program in ROM emulation, set break points, add patch code, and 
initiate program execution. The utility program is a useful tool for 
program development as well as hardware troubleshooting. 


VIII. PHYSICAL DESIGN 


The NCSU is a J-coded unit, 8 inches high by 36.25 inches wide, using 
1A Technology hardware. The unit contains 18 “FG” type circuit packs, 
each a uniquely coded apparatus, with two codes having additional 
MC coding for the documentation of firmware. These packs are 
arranged in three side-by-side apparatus housings occupying the 
majority of the unit (see Fig. 9). The circuit packs are connected by 
backplane wiring. The microprocessor bus leads and all of the +5 volt 
power and ground distribution are wired via the multilayer printed 
wiring board. Placing the bus leads in the multilayer board provides 
more noise immunity than is possible with wired connections. The 
natural flow of these leads would form long parallel horizontal runs. 
The danger in so much parallel exposure is crosstalk. To 
counteract the natural flow, a system of routing the leads as shown in 
Figure 10b was adopted. Contrast this with the natural flow as shown 
in Figure 10a; note that while the length has been slightly increased, 
the amount of parallel exposure has been cut in half. Figure 11 is a 
photograph of the A section of one of the layers of the multilayer printed 
wiring board, showing the actual routing. 





Fig. 9—Network-clock synchronization unit (front view). 


1126 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1981 




Fig. 10—Patterns. (a) Natural flow. (b) Anticrosstalk. 


The unit power consists of a 140-V to 5-V power converter, power 
control relays, power switch, and two circuit packs. 

The unit also contains a status panel (see Section IV), a terminal 
strip, and the jack for the BSRF. 


IX. TESTING AND PERFORMANCE EVALUATION 


To verify the phase-lock algorithm and its implementation, an 
experiment is performed. We start with an oscillator whose frequency 
is within $5 \times 10^{-11}$ of a stable reference. Then the oscillator’s 
word is changed by 3814 bits, or a frequency offset of 
$1.83 \times 10^{-7}$. The unit is placed in the fast-start mode and the 
phase error versus time history is recorded. 

Using eq. (8) with a = 6.29 x 10°°/s and 8 = 1.96 x 10°°/s (for a 
cycle time of 8.0 s and an oscillator sensitivity of $4.8 \times 10^{-11}$/bit), we 


Fig. 11—Actual bus routing. 

Fig. 12—Fast start. 


can calculate that the peak phase error should be 85 bits of phase 
comparison and should occur at t = 296 s. The experimental result is 
a peak of 87 at t = 284 s. 

The above experiment is repeated for normal mode with a frequency 
offset of 200 oscillator control word bits, which is equal to 
$0.962 \times 10^{-8}$. The experimental peak phase error is 188 and the 
predicted is 188.2. The peak time for the experiment is 346 min versus 
a predicted time of 344 min. 
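
The frequency offsets quoted for the two experiments follow from the oscillator sensitivity per control word bit. The short check below is a sketch; the function name is hypothetical, and the sensitivity is the rounded value of 4.8 parts in 10^11 per bit given in the text.

```python
SENSITIVITY_PER_BIT = 4.8e-11  # fractional frequency change per word bit

def fractional_offset(word_bits, sensitivity=SENSITIVITY_PER_BIT):
    """Fractional frequency offset produced by a change in the control word."""
    return word_bits * sensitivity

fast_start_offset = fractional_offset(3814)  # about 1.83e-7
normal_mode_offset = fractional_offset(200)  # about 0.96e-8
```

The small difference between the computed normal-mode value and the quoted one is consistent with the sensitivity having been rounded to two figures.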

The theoretical phase error expression is normalized with respect to 
the theoretical peak value, and the resulting curve plotted in Fig. 12 
for fast start and Fig. 13 for normal mode. The normalized data is 
plotted as x’s in the figures. The agreement between theory and reality 
is quite good. 

Fig. 13—Normal mode. 


X. SUMMARY 


Synchronization of the No. 4 ESS is performed by phase-locking the 
oscillators in the NCLK to the Bell System Reference Frequency or to 
the framing on a T1 line from another No. 4 ESS. Thus, the No. 4 ESS 
is in the upper layers of the treelike timing structure of the Switched 
Digital Network. A digital second-order phase-locked loop is used, 
exhibiting excellent convergence and stability characteristics. It is 
implemented with duplicated and matched microprocessors as part of 
the loop. This, along with extensive operational and hardware checks, 
ensures a high degree of confidence in data written to the oscillators. 
Most diagnostics for the synchronization hardware are contained in 
the microprocessors. Finally, experimental testing of the unit has shown 
actual performance to conform very well to the predicted time domain 
response. 


This article describes the hardware and maintenance software 
implementation used for the Digital Interface (DIF), a new 
transmission interface for the No. 4 ESS. The DIF replaces the Digroup 
Terminal (DT) and the Signal Processor 2 (SP2) used for terminating 
digital carrier trunks. It also provides a more economical 
transmission interface for the No. 4 ESS, enhances the use of the LT-1 
connector for terminating analog carrier facilities, and provides a 
standard peripheral/processor interface for maintenance. In conjunction 
with the DIF, a hierarchical, modularly structured maintenance software 
system was introduced to support the new No. 4 ESS peripherals. 
Incorporated into this system were the mechanisms required to 
support the introduction of microcomputer-based peripherals such as 
the DIF. 


I. INTRODUCTION 


The Digital Interface (DIF) is a newly introduced No. 4 ESS 
peripheral frame whose functions combine the operations performed by 
the Digroup Terminal (DT) and the Signal Processor 2 (SP2), with the 
exclusion of the supplementary matrix frame, which is an SP2 optional 
adjunct.!” 

The main functions of a DIF are to terminate DS-1 level signals and 
to multiplex them to a form suitable for the No. 4 ESS digital switch; 
to perform the necessary signaling interchange between the 
transmission and switching facilities; and to provide adequate fault 
detection and reconfiguration capability. These functions are identical 
to those performed by the DT/SP2 complex, but the DIF achieves them in 
a more compact and modern fashion. The DIF occupies less than one-half 
the space and uses two-thirds less power than the DT. This reduction was 
achieved in a number of ways: 

(i) by combining the equivalent of four DTs and an SP into a single 
structure, thereby eliminating a number of duplicated functions and a 
complex interface, 

(ii) by using custom and catalog large-scale integrated circuits 
instead of the 1A Technology previously used, and 

(iii) by utilizing both metal oxide semiconductors and bit-sliced 
microcontrollers to replace the hardwired logic of the DT and SP 
controllers. 

As will be discussed in more detail later, these changes resulted in 
a frame with an entirely different internal philosophy which was more 
flexible and more consistent with the architecture of the other 
peripheral frames in the No. 4 ESS. Sections II and III give a high-level 
logical and physical view of the DIF. The article then describes the 
Digital Interface Unit (DIU) architecture in Section IV, followed by the 
Digital Interface Controller architecture in Section V. Finally, the 
maintenance software developed for the DIF is presented in Section VI. 


II. OVERALL FRAME ARCHITECTURE 


A fully equipped DIF consists of a duplex controller (DIC), duplex 
Interface to the Peripheral Unit Bus (IPUB), 32 working DIUs, and 
two protection spare DIUs as shown in Fig. 1. The DIF connects directly 
to the Peripheral Unit Bus (PUB), interfaces directly or via an echo 
suppressor terminal with the Time Slot Interchange (TSI) of the No. 4 
ESS network in the DS-120 format, and provides a DS-1 transmission 
interface with T1 facilities or LT-1 as shown in Fig. 2. Each DIU 
terminates five DS-1 signals, giving the DIF a capacity to terminate a 
total of 160 DS-1 signals (3840 trunks). A photograph of the DIF appears 
in Fig. 3. 

Fig. 1—DIF block diagram. 

Fig. 2—No. 4 ESS block diagram. 
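
The capacity figures can be checked with a few lines of arithmetic. This is a sketch: the constant names are hypothetical, and the 24 trunks per DS-1 signal is the standard channel count, which the text implies (3840 / 160) but does not state.

```python
WORKING_DIUS = 32
SPARE_DIUS = 2
DS1_PER_DIU = 5
TRUNKS_PER_DS1 = 24  # standard DS-1 channel count; implied, not stated

working_ds1 = WORKING_DIUS * DS1_PER_DIU                      # 160
trunks = working_ds1 * TRUNKS_PER_DS1                         # 3840
ds1_interfaces = (WORKING_DIUS + SPARE_DIUS) * DS1_PER_DIU    # 170
```

The last figure counts the DS-1 interfaces in the spare DIUs as well, which is why a fully equipped frame carries 170 DS-1 interface circuits for 160 working signals.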


III. PHYSICAL DESIGN 
3.1 Overview 


The circuitry which makes up the DIF is of two types. One type is of 
a transmission nature which is characterized by relatively few func- 
tional blocks arranged in a serial fashion. The other type is of a 
processor nature consisting of many functional blocks which have a 
high degree of interconnectivity typically in the form of large high- 
speed parallel buses. Within the limits of technology used in the DIF, 
circuit pack partitioning of the processor function indicated the need 
for a large Input/Output (I/O) circuit pack capability, whereas the 
transmission type of circuits required significantly less I/O. This 
difference influenced the physical design of the DIU, which is basically 
a transmission function, and that of the DIC, which is basically a 
processor or computer type of function. 

The physical design objectives of the DIF included the following: 

(i) Compatibility with the No. 4 ESS environmental, reliability, 
and frame I/O connector requirements. 

(ii) Significant cost, space, and power reduction over the DT/SP2 
complex which it replaces. 

(iii) Circuit partitioning which facilitates interconnection, fault 
detection, fault diagnostics, and maintenance. 

(iv) Physical embodiment which enables proper electrical 
functioning. 


3.2 Device technology 


The DIF uses low- and medium-power transistor-transistor logic (TTL), 
low- and high-power Schottky TTL, emitter-coupled logic, 
complementary metal-oxide semiconductor, and N-channel metal-oxide 
semiconductor integrated circuit technologies. The level or scale of 
integration ranges from small-scale integration to large-scale 
integration (LSI), with a fully equipped DIF using a total of about 
18,000 devices of about 200 codes. 


3.3 Frame and circuit-pack partitioning 


The circuit-pack size chosen for the DIUs was determined by the 
DS-1 interface, described in Section IV, since this is the dominant entry 
in each DIU (five out of a total of nine circuit packs). Thus, a printed 
wiring board (PWB) which is nominally 8 inches high and 9 inches deep 
with a 114-lead connector was chosen for the DIU circuitry. This size 
made possible the double-sided PWB implementation (see Fig. 4) of the 
DS-1 interface, which is the most cost-sensitive circuit in the DIF 
because of the large number used (170/DIF). The remaining DIU 
circuitry required two double-sided and two four-layer PWB circuit 
packs. The four-layer configuration is capable of accommodating 
approximately 50 percent more circuitry than the double-sided boards. 

Fig. 3—Photo of DIF. 

The same basic circuit-pack size is also used for the DIC; however, 
because of its large I/O requirements, a 184-lead connector is used 
instead of the 114-lead connector. The DIC consists of 94 circuit packs, 
of which 48 are four-layer PWBs and the remaining 46 are double-sided. 

Since all DIUs typically communicate with the controller via a bus, 
locating them about a central cable duct (see Fig. 5) permits the bay 
cabling to be kept to a minimum length, with a common access point 
for a pair of DIUs, which minimizes the number of cable connectors 
required. Those circuit packs that interface with the DIC are located 
near the cable duct. Therefore, the circuit pack positions of the DIU on 
the right side of the duct are a mirror image of those of the DIU on the 
left side. 

The DIF uses the No. 1 ESS framework: one double-bay frame which 
is 7 feet high, 6 feet 6 inches wide, and 12 inches deep; and one 
single-bay frame which is 3 feet 3 inches wide. Two DIUs and their 
associated power units and switches are mounted on an 8-inch-high 
shelf. At the top of bay 0 and bay 2 are the protection switch circuit 
packs and the DS-1 and DS-120 I/O connectors. These circuit packs are 
nominally 6 inches high and 9 inches deep. They contain the relays 
required for transferring the DS-1 and DS-120 signals from a DIU to a 
spare DIU. They also contain equalizers for the DS-1 lines and the 
transmit and receive coax connectors for the DS-120 signal. DS-1 
connectors are located on a panel directly above the protection-switch 
circuit packs. The spare DIU in bay 0 provides backup protection for 
15 DIUs located in bay 0 and the left DIU located in bay 1. Likewise, 
the spare DIU in bay 2 provides backup protection for the 15 DIUs 
located in bay 2 and the right DIU located in bay 1. 

Fig. 4—Double-sided PWB. 

The duplex PUB interface is located at the top of bay 1 with its 
associated power units as shown in Fig. 5. The circuit packs which 
make up this function are nominally 4 inches high and 9 inches deep 
and use a 92-lead connector. The duplex controller is immediately 
below it and the associated power units are located at the bottom of 
the bay. 

The circuit-pack positions of the PUB interface are not 
mirror-imaged about the cable duct as are those of the DIU and the DIC. 
This is for two reasons. First, the PUB interface does not use the cable 
duct for interconnecting to the controller. Second, the PUB is connected 
via cable connectors to the same backplane pins which the circuit packs 
connect to; thus, to provide a PUB connector arrangement which is 
identical for both PUB0 and PUB1 to simplify installation, the PUB 
interface 0 and 1 circuit packs must have the same order. This 
arrangement is also used for other No. 4 ESS peripherals. 

Fig. 5—DIF diagram (PUB interface and power units). 

The power switches for the PUB interface, the DIC, and the DIUs are 
collocated with their respective functions to make their association 
self-evident. 

Heat baffles are used to divert air flow out of the back of the frame 
and to cause aisle air to be taken in from the front of the frame at 
various levels. This prevents the cooling air from becoming excessively 
heated as it rises through the frame. 


3.4 Frame cabling 


Most of the frame cabling is 26 American wire gauge (AWG) twisted- 
pair, 24-conductor flat cable. The wires are on 0.062-inch centers with 
an untwisted section every 18 inches for terminating connectors (see 
Fig. 6). The cable is terminated to connectors which have insulation 
displacing terminals. These connectors plug onto 25-mil square pins 
located in the cable duct area. 

The frame cabling is located on the circuit-pack side of the backplane 
(see Fig. 7), and wiring to the cable connector pins is done in the 
backplane with surface wiring. This cabling scheme isolates the frame 
cabling from the backplane surface wiring by designating a specific 
area for frame cabling. The cable is routed horizontally between cable 
ducts in designated areas located in front of the frame upright on the 
top surface of heat baffles. 

Fig. 6—Frame cabling (flat cabling). 

Fig. 7—Frame cabling (cable duct and local cable). 


3.5 Circuit-pack design 


The circuit-pack PWB sizes used in the DIF conform with the 
Bellpac* standard circuit-pack size. The circuit-pack connectors and 
the latch used on these circuit packs are part of the Bellpac hardware 
system.’ Before the design of circuit packs, design standards were 
determined to assure a functional and manufacturable design. These 


* Trademark of Western Electric. 

Fig. 8—DIC circuit pack utilizing appliquéd bus bars. 


standards include maximum component heights; component place- 
ment constraints; conductor path grid, sizes, and routing; plated- 
through hole sizes, lands, and placement; connector fanout pattern; 
placement of I/O devices; and use of power/ground decoupling capac- 
itors. 

An appliquéd bus bar is used on some double-sided PWB circuit 
packs for distributing power and ground (see Fig. 8). This makes more 
area available for routing signal paths, since wide printed paths are 
not needed for distributing power to the integrated circuits. 

The four-layer PWB circuit packs consist of a power surface layer, 
two buried signal layers, and a ground surface layer (see Fig. 9). The 
signal layers are buried so that their fine features (metalization and 
clearances as small as 12 mils) are protected, whereas the power/ 
ground layers which consist of relatively conservative features (metal- 
ization and clearances of 25 mils or greater) are placed on the outer 
layers. 


3.6 Backplanes 


The backplanes which interconnect the circuit packs consist of 
pin-populated PWBs (see Fig. 10). These backplanes use the Bellpac 
compliant pin and are designed to Bellpac requirements.’ Power and 
ground are distributed to the circuit pack positions by printed paths. 
The signal interconnections are in the form of 30 AWG single-ended 
and twisted-pair wire, terminated by wire-wrap connections. 


3.7 Power 


Duplicated 140-volt and 24-volt power feeds are required by the DIF. 
The 140-volt power is used for powering the power units which provide 
+5 volt and +12 volt power for the integrated circuits. The 24-volt 
power is used for the alarm relays, power control circuitry, IPUB 
drivers, protection switch relays, and indicator lights. The DIF draws 
8.3 amps from each 140-volt bus and 1.7 amps from each 24-volt bus. 


3.8 Operating environment 


The DIF was designed to operate in the No. 4 ESS office environment. 
This required that it be capable of operating over a temperature range 
of 4°C to 38°C and a relative humidity range of 20 percent to 55 
percent, except for a total of 15 days per year during which time it 
must operate from 2°C to 50°C and 20 percent to 80 percent relative 
humidity for no more than 3 days at a time. To achieve these 
objectives, hermetically sealed integrated circuit (IC) packages or 
beam-leaded sealed junction ICs were used with a typical operating 
ambient temperature capability of 90°C. This operating temperature 
allows for some margin, since the maximum temperature rise above the 
aisle ambient in the DIF is 30°C. 

Fig. 9—DIC 4-layer PWB. 

Fig. 10—DIU backplane. 


3.9 DT/DIF comparison 


The DIF represents a significant size, power, and cost reduction over 
the DT/SP2 complex. One DIF interfaces 160 T1 lines, whereas four DTs 
and one SP2 are required to interface this same number of lines. The 
DIF lineup is 9 feet 9 inches long versus 23 feet 10 inches for the 
DT/SP2 complex. The power dissipation of the DIF is 2500 watts versus 
7500 watts for the DT/SP2 complex. A DIF costs about 50 percent less 
than the DT/SP2 complex. 
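
The size and power figures can be reduced to ratios with a short calculation. This is a sketch using only the numbers quoted above; the variable and function names are hypothetical.

```python
def to_inches(feet, inches):
    """Convert a feet-and-inches length to inches."""
    return feet * 12 + inches

dif_length = to_inches(9, 9)       # 117 inches
dt_sp2_length = to_inches(23, 10)  # 286 inches
length_ratio = dif_length / dt_sp2_length  # about 0.41

dif_power, dt_sp2_power = 2500, 7500       # watts
power_saving = 1 - dif_power / dt_sp2_power  # two-thirds
```

The lineup is about 41 percent of the original length, and the power saving is two-thirds, in line with the "less than one-half the space" and "two-thirds less power" stated in the introduction.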


IV. DIGITAL INTERFACE UNIT 


The DIU consists of the following four major blocks (see Fig. 11): 
(i) the DS-1 interface (one per digroup), 
(ii) the DS-120 interface, 
(iii) clock generation, and 
(iv) control interface. 
The DIU differs from the digroup terminal unit in that a number of 
per-digroup functions which were performed using common control 
techniques are now done on a per-digroup basis. This permits the use 
of custom LSI, which can be shared with other systems, and reduces 
the complexity of maintenance, as test vector generators and references 
are no longer required. 

Fig. 11—DIU block diagram. 


4.1 DS-1 interface 


Figure 12 is a detailed diagram of the DS-1 interface which consists 
of a receive section, a transmit section, and a maintenance and report 
section. 


4.1.1 Digroup receive function 


The DS-1 receive circuitry terminates the T1 line, recovers line 
timing and data, and converts the incoming data to a form suitable 
for multiplexing with other digroups into a DS-120 level. 

The conversion from bipolar to unipolar, clock extraction, and signal 
regeneration are done with a custom LSI receive converter, a 
complementary bipolar integrated circuit chip. The output of this chip 
is a single-rail unipolar PCM serial stream and a properly phased 
1.544-Mb/s clock. These signals feed the framing and receive logic chip 
(F/R) and the digroup receive chip (RCV LSI), both of which are custom 
LSI devices. The F/R recovers the DS-1 data and signaling framing. To 
function, this chip requires an external line channel counter which is 
located on the RCV LSI (see Fig. 13). Framing is accomplished by the 
F/R interrupting the line channel counter a sufficient number of times 
to restore framing. 

The F/R chip supplies a signal indicating a framing error. This signal 
is used to estimate the error rate and to block the updating of signaling 
information. The chip also generates two signals which mark the A 
and B signaling frame. 

The major per-digroup receive functions are contained in the RCV 
LSI (Fig. 13). The primary function of the RCV LSI is to reduce the 
incoming PCM data 125-μs frame to a specific 23.4-μs digroup interval, 
to recover A and B signaling bits, and to compensate for differences in 
line-frame frequency and phase with respect to office-frame frequency 
and phase. Although the RCV LSI recovers both A and B channels, only 
the A channel is implemented in the present version of the DIF. 
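
The 23.4-μs digroup interval can be reproduced from the frame structure. The sketch below assumes (this section does not state it outright) that the DS-120 frame is divided into 128 equal time slots, numbered 0 to 127, and that each of the five digroups in a DIU occupies 24 of them.

```python
FRAME_US = 125.0            # one PCM frame
SLOTS_PER_FRAME = 128       # assumption: DS-120 time slots 0..127
CHANNELS_PER_DIGROUP = 24
DIGROUPS_PER_DIU = 5

slot_us = FRAME_US / SLOTS_PER_FRAME
digroup_interval_us = CHANNELS_PER_DIGROUP * slot_us
working_slots = DIGROUPS_PER_DIU * CHANNELS_PER_DIGROUP
spare_slots = SLOTS_PER_FRAME - working_slots

print(f"time slot: {slot_us:.4f} us")
print(f"digroup interval: {digroup_interval_us:.1f} us")
print(f"working slots: {working_slots}, spare slots: {spare_slots}")
```

Under these assumptions each digroup is compressed into about 23.4 μs, and 8 of the 128 time slots are left spare.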

Fig. 13—RCV LSI block diagram. 


PCM data is converted from serial to parallel (S/P) form and complete 
125-μs frames are stored in an A and B random-access memory (RAM) 
store. (A and B here should not be confused with A and B signaling 
bits.) The stores are alternately written and read so that in general, 
while one store (e.g., A) is being written under line timing, the other 
store (e.g., B) is being read under office timing. If the relationship of 
line frame to office frame allows reading and writing the store in such 
a manner as to intermix frames, a slip control generates a double A 
read to correct the situation. 
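
The alternating-store arrangement can be illustrated with a frame-level model. This is a simplified sketch, not the RCV LSI logic: the real circuit works word by word with address generators, and the class and method names below are hypothetical. The model only shows why a faster office clock leads to a repeated frame and a faster line clock leads to a lost frame.

```python
class TwoStoreBuffer:
    """Frame-level model of alternating A and B stores (illustrative only)."""

    def __init__(self):
        self.stores = {"A": None, "B": None}
        self.write_store = "A"     # store being written under line timing
        self.unread = False        # True if the last complete frame is unread
        self.positive_slips = 0    # a frame was read twice
        self.negative_slips = 0    # a frame was never read

    @staticmethod
    def _other(name):
        return "B" if name == "A" else "A"

    def write_frame(self, frame):
        """Line side: store one complete frame, then switch stores."""
        if self.unread:
            self.negative_slips += 1   # office side fell behind; a frame is lost
        self.stores[self.write_store] = frame
        self.write_store = self._other(self.write_store)
        self.unread = True

    def read_frame(self):
        """Office side: read the store that is not being written."""
        if not self.unread:
            self.positive_slips += 1   # no new frame yet; the old one repeats
        self.unread = False
        return self.stores[self._other(self.write_store)]


buf = TwoStoreBuffer()
buf.write_frame("f0")
first = buf.read_frame()    # normal read
repeat = buf.read_frame()   # office side ran ahead: same frame again
print(first, repeat, buf.positive_slips)
```

The slip counters correspond to the positive and negative slip reports listed later in this section.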

PCM data through the stores is maintained by serial parity over a 
frame of data. Parity generated at the input to the RCV LSI is stored 
and compared with parity generated at the output of the A and B 
stores. In addition to parity, the framing bit, D9, is passed through the 
stores for maintenance purposes. This is possible since the contents of 
the D9 bit are defined by signals from the F/R chip. 
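
The parity check across the stores amounts to comparing one parity bit computed before the stores with one computed after them. A minimal sketch follows, with bits held in Python lists and hypothetical function names; it also shows the usual limitation that a single parity bit misses an even number of bit errors.

```python
def serial_parity(bits):
    """Parity accumulated over a serial stream: 1 if the count of ones is odd."""
    parity = 0
    for bit in bits:
        parity ^= bit & 1
    return parity

def store_is_intact(frame_in, frame_out):
    """Compare parity stored at the input with parity computed at the output."""
    return serial_parity(frame_in) == serial_parity(frame_out)

frame = [1, 0, 1, 1] + [0] * 189         # 193 bits: one DS-1 frame
one_error = [0] + frame[1:]              # first bit flipped
two_errors = [0, 1] + frame[2:]          # first two bits flipped

print(store_is_intact(frame, frame))       # True
print(store_is_intact(frame, one_error))   # False
print(store_is_intact(frame, two_errors))  # True (double error goes unnoticed)
```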

Extraction of signaling information occurs at the output of the A 
and B RAMs. Signaling information is stored in two 24-bit shift registers 
(signaling) and the data is checked by parity. Data is entered into 
these stores under the command of signals from the F/R chip which 
indicate when to extract signaling information. 

The RCV LSI also contains a detector which determines if PCM bit 2 
is held at zero for all frames in a 32-ms interval. This signal is forwarded 
to the DIC [via the error-source register (ESR) and report (RPT) scan], 
which, in turn, times it for the remote (yellow) alarm. 
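
As an illustration, the bit-2 detector can be modeled as a test over the 256 frames that make up a 32-ms interval. The sketch assumes the usual T1 convention that bit 1 is the most significant bit of each 8-bit word, so bit 2 corresponds to mask 0x40; the function name is hypothetical.

```python
FRAMES_PER_32_MS = 256   # 32 ms divided by the 125-us frame period
BIT2_MASK = 0x40         # assumption: bit 1 is the most significant bit

def bit2_held_at_zero(frames):
    """True if bit 2 is zero in every word of every frame in the interval."""
    return all((word & BIT2_MASK) == 0 for frame in frames for word in frame)

# 24 channel words per frame; 0x3F has bit 2 clear, 0x7F has it set.
alarm_interval = [[0x3F] * 24 for _ in range(FRAMES_PER_32_MS)]
normal_interval = [[0x3F] * 24 for _ in range(FRAMES_PER_32_MS)]
normal_interval[100][7] = 0x7F

print(bit2_held_at_zero(alarm_interval))    # True
print(bit2_held_at_zero(normal_interval))   # False
```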

A number of functions are combined on the RCV LSI chip to form a 
report function stream: 

(i) line bit D2 stuck at zero, 
(ii) out of frame (OOF), 
(iii) framing error (FE), 
(iv) positive slip (positive slip occurs when the system frame rate 
exceeds the line frame rate), 
(v) negative slip, 
(vi) alarms. 

These reports are used by the DIC to determine the status of the 
receive portion of the DS-1 interface. 

The RCV LSI contains a number of matchers and alarms. Detected 
failures are combined into a single alarm list. To determine if the 
alarms and matches are functioning, exercises can be sent to the RCV 
LSI, which generates alarms without interfering with the processing of 
data. A digroup clear input is used to clear alarms. 


4.1.2 Superframe Pattern Detector 


The Superframe Pattern Detector (SFPD) shown in Fig. 12 maintains 
the integrity of the data flow through the RCV LSI by checking that the 
D9 bit contains the subframe pattern and that the phasing of this 
pattern corresponds to that derived from the F/R chip. To avoid the 
complication of accounting for slips, framing errors and out-of-frames, 
the SFPD is inhibited during these states. 

The ability of the SFPD to detect superframe pattern errors is tested 
by frame-resident exercise functions (Sections 4.5 and 6.5). 


4.1.3 Digroup transmit functions 


In the transmit direction, the DS-1 interface receives parallel data, 
plus even parity, from the DS-120 interface. The main functions of this 
portion of the interface are to select the data for the appropriate 
digroup, convert it to a 1.544-Mb/s serial stream with the appropriate 
framing information, and insert signaling information at the 
appropriate time. The transmit circuitry consists of two major blocks, the 
digroup transmit chip (TMT LSI) and the unipolar to bipolar converter, 
which is composed of discrete components. The major functions are 
performed in the TMT LSI (see Fig. 14). Although each digroup interface 
receives its data during a different 23.4-μs interval, the outgoing DS-1 
data are frame and framing pattern aligned for maintenance purposes. 

The TMT LSI, like the RCV LSI, contains an address generator, a 
read-write control, and a RAM divided into A and B sections. While a whole 
frame of data is being written into one section under system timing, 
the other is being read under line timing. Two stores are used for 
convenience as this allows all digroups to be checked with a reference 
pattern at the same time on the T1 line side. The TMT LSI also receives 
signaling information (plus parity) from the control interface (CI) (see 
Fig. 11) for the A signaling channel. It stores this information in a 
24-bit shift register, where it is recirculated until an update interval 
occurs. A superframe pattern generator driven by a synchronizing signal 
from the CI defines the time at which signaling insertion should take 
place. The TMT LSI has the capability of handling both A and B signaling 
storage and insertion, but in the present DIF, only the A channel is 
used. The TMT LSI also has the ability to prevent the insertion of 
signaling data in bit position 8 via a control signal from the CI. When 
signaling is inserted, the parity over data is altered. 

At the output of the TMT LSI, serial parity over a frame of data is 
compared against stored input serial parity over the same data. In 
addition, the output serial parity over D8 for a frame is compared 
against stored output serial parity over D8 for signaling insertion 
maintenance. The TMT LSI contains a zero code detector and a forced 
D2 inserter. The zero code detector forces bit 7 to a one if all bits of a 
word are zero. The forced D2 inserter, under command of the CI, forces 
bit 2 to a zero to transmit the yellow alarm to the distant terminal. 
This action is initiated by a command from the DIC. The forced D2 
feature is maintained by returning a status bit to the CI, while the zero 
code detector is duplicated. 
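
The two word-level operations just described are simple bit manipulations. The sketch below assumes the usual T1 bit numbering, in which bit 1 is the most significant bit of the 8-bit word and bit 8 the least significant; the function names are hypothetical.

```python
BIT2_MASK = 0x40   # assumption: bit 1 is the most significant bit
BIT7_MASK = 0x02

def suppress_zero_code(word):
    """Force bit 7 to one when every bit of the word is zero."""
    return word | BIT7_MASK if word == 0 else word

def force_d2(word):
    """Force bit 2 to zero, as done to send the yellow alarm."""
    return word & ~BIT2_MASK & 0xFF

print(hex(suppress_zero_code(0x00)))   # 0x2
print(hex(suppress_zero_code(0x80)))   # 0x80 (unchanged)
print(hex(force_d2(0xFF)))             # 0xbf
```

Zero-code suppression keeps at least one pulse in every word on the line, which the far-end clock recovery depends on.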

As in the RCV LSI, the TMT LSI multiplexes all of its status and alarm 
bits into a single stream which is forwarded to the reporting multi- 
plexer. In addition, it has exercises to test the various matches and 
alarms. 

The serial PCM data from the TMT LSI is modified by the unipolar to 
bipolar converter. This block is made of discrete components and puts 
out a bipolar signal of the proper amplitude and shape for driving a T1 
line. 


4.2 DS-120 interface 


The DS-120 interface consists of four functional blocks: the receive 
access, the receiver, the transmit access, and the transmitter (see Fig. 
11). 


4.2.1 Receiver and receive access 


The DS-120 termination contains an analog line receiver similar to 
the one used in the digroup terminal whose function is to terminate 
the coaxial cable, amplify the PCM data signal from the TSI, and supply 
a sampling clock for the data. The data signal framing is determined 
and the signal is converted from the bit/bit complement serial format 
of the DS-120 link to an 8-bit parallel form with parity. 

The receiver contains a squelch function. If two successive frames 
have pair violations, the squelch is applied. When the receiver recovers 
frame, the squelch hangs over for 8 ms. The squelch applies an all- 
ones code to the outgoing data to prevent analog carrier overload. 
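
The squelch behavior can be sketched as a small state machine stepped once per frame. The model below makes two simplifying assumptions that are not in the text: "recovers frame" is taken to mean the first frame without a pair violation, and the 8-ms hangover is counted as 64 frames of 125 μs. Class and function names are hypothetical.

```python
HANGOVER_FRAMES = 64   # 8 ms divided by the 125-us frame period

class Squelch:
    """Per-frame model of the receiver squelch (illustrative only)."""

    def __init__(self):
        self.bad_run = 0      # successive frames with pair violations
        self.hangover = 0     # good frames still to be squelched
        self.active = False

    def step(self, pair_violation):
        """Process one frame; return True if this frame is squelched."""
        if pair_violation:
            self.bad_run += 1
            if self.bad_run >= 2:
                self.active = True
                self.hangover = HANGOVER_FRAMES
        else:
            self.bad_run = 0
            if self.active:
                if self.hangover == 0:
                    self.active = False
                else:
                    self.hangover -= 1
        return self.active

def squelch_words(words, squelched):
    """Replace outgoing data with the all-ones code while squelched."""
    return [0xFF] * len(words) if squelched else list(words)

sq = Squelch()
print(sq.step(True), sq.step(True))   # False True
```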

In the receive access, test vectors (derived from unit clock signals) 
are inserted in spare time slots 127 and 0. These vectors cannot 
propagate through the per-digroup equipment. However, time slots 
125, 126, 127, and 0 are forwarded to the transmit access. A two-time 
slot delay occurs between the transmitter and the receiver. Hence, 
when a DIU is looped on itself, data in time slots 127 and 0 are sent to 
1 and 2, 3 and 4 are sent to 5 and 6, and so forth. Thus, when looped, 
the vectors eventually occupy all working time slots and can be used 
to test the looped DIU. This condition pertains to the spare DIU when 
not in service and for a DIU when it is protection-switched while in an 
out-of-service condition. If a DIU cannot be protection-switched, it is 
possible to notify a distant office of an out-of-service condition under 
most circumstances, by sending an exercise that blocks the transmis- 
sion of the framing signal. Time slots 125 and 126 are looped to permit 
the TSI to test transmission to and from the DIF. 
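
The way the looped vectors spread through the frame can be followed with a short simulation. It assumes a 128-slot frame and that each trip around the loop shifts data by exactly two time slots, as described above; the function name is illustrative and the time taken per trip is not modeled.

```python
SLOTS = 128
LOOP_DELAY = 2   # time slots of delay per trip around the loop

def passes_to_fill(start_slots=(127, 0)):
    """Count loop passes until vectors occupy every time slot."""
    occupied = set(start_slots)
    frontier = set(start_slots)
    passes = 0
    while len(occupied) < SLOTS:
        # Each pass moves the most recently filled slots two positions on.
        frontier = {(slot + LOOP_DELAY) % SLOTS for slot in frontier}
        occupied |= frontier
        passes += 1
    return passes

print(passes_to_fill())
```

Starting from slots 127 and 0, the first pass fills slots 1 and 2, the next fills 3 and 4, and so on, so 63 passes cover the whole frame in this model.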


4.2.2 Transmitter and transmit access 


In the transmit access, the receive and transmit streams are com- 
bined together. To ensure proper operation of the access gates, parity 
over the spare time slots 125 and 126 is even while transmit data parity 
is odd. A failure to multiplex generates a parity failure. The conversion 
of data from a parallel form to the bit-bit prime serial format of the 
DS-120 link is protected by recomputing serial parity, in test vector 
time slots 127 and 0, and comparing it with a reference parity for those 
time slots. 

The transmitter consists of a TSI line driver similar to the type used 
in the DT which amplifies and buffers the DS-120 signal to drive up to 
1000 ft of 100-Ω coaxial cable. 


4.3 Clock selection, generation, and decoding 


To reduce the number of leads between the DIC and the DIUs, each 
DIU generates its own clock chains. All DIUs must be frequency- and 
phase-locked to the DIC; this is achieved by furnishing to each DIU six 
signals over a duplicated link: 

(i) System signals 
(a) 16.384-Mb/s square wave clock, 
(b) 8-kHz synchronizing signal, 
(c) 31.25-Hz synchronizing signal. 
(ii) Line signals 
(a) 1.544-Mb/s square wave clock, 
(b) 666.67-Hz synchronizing signal for the T1 line superframe 
pattern, 
(c) 8-kHz synchronizing signal for the T1 channel counters. 

In addition to the above signals, two reference signals are sent to 
the DIUs to ensure that the local clock generation is proper. The first 
is a signal representing parity over all the system clocks generated in 
the DIU having periods from 250 μs to 32 ms. The second is a reference 
signal which is compared with superframe patterns generated in the 
digroup transmit chip for maintenance purposes. 

All of the above signals are duplicated and selected by a 1A Processor 
command via the DIC. 


4.4 Control interface 


Information exchanged between a DIU and the DIC is processed by 
the CI. 

From the DIC to each DIU there are four data streams: 

(i) M signaling (A channel outgoing), 
(ii) enable signaling (ENSIG), 
(iii) per-channel signaling inhibit (PCSINH), 
(iv) unit maintenance bus (UMB). 

These signals are originated in each of the controller halves, distrib- 
uted to all units serially via balanced drivers on a duplex basis, and 
selectable by a control command via the CI. 

M signaling contains the multiplexed signaling data to be distributed 
to the various DIUs and digroups. Each DIU has a hardwired time of 
unit code (TU), so that a particular DIU can only off-load signaling 
during a unique 125-μs interval every 8 ms. Data integrity is maintained 
by odd parity in time slot 127. The ability of a DIU to correctly decode 
the TU is checked every 8 ms by the DIC via the looped spare time 
slots. The CI generates serial parity over each stream and forwards 
them to the five DS-1 interfaces along with the multiplexed signaling 
stream. The per-digroup circuitry picks off its signaling data based on 
a digroup select clock signal. If a parity failure occurs for the incoming 
data, the CI alarms but does not block updating as that would require 
a 128-bit store. Signaling information, however, cannot be updated 
unless the ENSIG is in the proper state. Enable signaling permits the 
DIC to block updating of DIUs from a faulty DIC half. If the signaling 
failure occurs because of a detected DIU failure, the unit is 
protection-switched. 
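
The off-loading rules in this paragraph can be summarized in a short sketch. Several details are assumptions made for illustration only: the TU code is compared with a frame counter taken modulo 64 (8 ms divided by 125 μs), the odd-parity check covers the whole 128-bit interval with the parity bit in time slot 127, and the first 120 bits carry the signaling data. The function names are hypothetical.

```python
FRAMES_PER_CYCLE = 64    # 8 ms divided by the 125-us frame period
SIGNALING_BITS = 120     # assumption: 5 digroups x 24 channels come first

def has_odd_parity(bits):
    return sum(bits) % 2 == 1

def off_load(frame_index, tu_code, stream_bits, ensig):
    """Return (signaling_update, parity_alarm) for one 125-us interval."""
    if frame_index % FRAMES_PER_CYCLE != tu_code:
        return None, False              # not this unit's interval
    alarm = not has_odd_parity(stream_bits)
    if not ensig:
        return None, alarm              # updating blocked by ENSIG
    # A parity failure raises an alarm but does not block the update.
    return stream_bits[:SIGNALING_BITS], alarm

good = [0] * 127 + [1]   # odd parity: a single one, in time slot 127
bad = [0] * 128          # even parity: fails the odd-parity check

print(off_load(5, 5, good, True)[1])   # False: no alarm
print(off_load(5, 5, bad, True)[1])    # True: alarm, update still made
```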

Per-channel signaling inhibit allows a DIU to block the insertion of 
signaling information into the DS-1 stream for any channel in that DIU. 
The CI contains a 128-bit store which holds the PCSINH status. This 
signal permits the DIU to pass full 8-bit information for 
common-channel interoffice signaling (CCIS) or for permanently connected 
special service trunks. The CI loops and multiplexes the received spare 
PCSINH time slot data with the M signaling spare time slot data so that 
the DIC can maintain the distribution to the DIU. Distribution is 
checked every 16 ms by the DIC. Data to the DIUs are protected by odd 
parity in time slot 127. The CI distributes PCSINH on a looped basis to 
all digroups and the parity over the data returned to the CI is compared 
against the parity received from the DIC. 

From each DIU to the DIC two streams of data exist: 

(i) multiplexed report stream, 
(ii) E signaling stream (A channel incoming). 

The report stream is composed of the per-digroup reports (24 bits per 
digroup) as shown in Fig. 15. Thus, the report stream is composed of 
120 bits for each DIU. The various per-digroup report streams are 
multiplexed in the DIU backplane and this multiplexed stream is then 
combined with the 8 common alarm bits in the CI. The DIU repeats the 
reports every 125 μs. The common control reports are autonomously 
cleared every 32 ms, whereas the per-digroup reports are cleared via a 
command from the DIC. The autonomous clear for each DIU occurs at 
staggered times so that the DIC can interrogate the DIUs without 
danger of missing a failure report because of a DIU self-clearing action. 
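
A minimal sketch of how one 125-μs report interval might be split apart is given below. The bit ordering is an assumption made for illustration (five 24-bit digroup reports followed by the 8 common alarm bits); the actual layout is the one defined in Fig. 15. The function name is hypothetical.

```python
DIGROUPS = 5
BITS_PER_DIGROUP = 24
COMMON_ALARM_BITS = 8
REPORT_BITS = DIGROUPS * BITS_PER_DIGROUP + COMMON_ALARM_BITS   # 128

def split_report_frame(bits):
    """Split one report interval into per-digroup reports and common alarms."""
    if len(bits) != REPORT_BITS:
        raise ValueError("expected 128 report bits")
    digroup_reports = [
        bits[i * BITS_PER_DIGROUP:(i + 1) * BITS_PER_DIGROUP]
        for i in range(DIGROUPS)
    ]
    common_alarms = bits[DIGROUPS * BITS_PER_DIGROUP:]
    return digroup_reports, common_alarms

# Use the bit positions themselves as data to make the split visible.
reports, common = split_report_frame(list(range(128)))
print(len(reports), reports[0][:3], common)
```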

The multiplexed report stream which is transmitted in simplex form 
to the DIC from each DIU is driven from an unbalanced source and its 
ability to transmit data is maintained by frequent controller exercises. 

The signaling stream is composed of five multiplexed digroup E 
signaling channels and looped channels. Seven spare signaling channels 
(121 to 127) received by the CI are looped back to the DIC via the A 
signaling stream with a 125-μs delay. The DIC will only look at the 
looped data during its normal E scan once every 8 ms. Reception of M 
signaling and PCSINH is tested by verifying that the looped channels 
contain these signals during alternate 8-ms scans. The signaling stream, 
like the report stream, is simplex from each DIU and driven by an 
unbalanced source. 


Fig. 15—DIU multiplexed report stream. 

The UMB is the communication link from the DIC to the DIUs. It is 
via this stream that the DIC can exercise the DIUs and insert the remote 
alarms. The UMB line format is shown in Fig. 16. Data on the 
maintenance bus is frame synchronized to DIU clocks but is not distributed 
by decoding a TU. Instead, a DIU decodes six address bits, U0 to U5, to 
determine if the data is for it. A seventh bit, ALLUN, is used to address 
all DIUs simultaneously. 
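
The address decoding can be sketched as follows. The bit weighting is an assumption (U0 is taken as the least significant bit), and the function name is hypothetical.

```python
def unit_is_addressed(unit_number, address_bits, allun):
    """Decide whether UMB data is meant for this DIU.

    address_bits holds U0..U5; U0 is assumed to be least significant.
    """
    if allun:
        return True   # ALLUN addresses every unit at once
    if len(address_bits) != 6:
        raise ValueError("expected six address bits U0..U5")
    decoded = sum(bit << position for position, bit in enumerate(address_bits))
    return decoded == unit_number

print(unit_is_addressed(5, [1, 0, 1, 0, 0, 0], allun=False))   # True
print(unit_is_addressed(6, [1, 0, 1, 0, 0, 0], allun=False))   # False
print(unit_is_addressed(6, [1, 0, 1, 0, 0, 0], allun=True))    # True
```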

The function enable determines the action to be taken by a DIU. If 
the action to be taken is for a specific digroup, then the digroup 
identity will be flagged. The exercise fields are not coded, i.e., each bit 
of an exercise word is a specific common or digroup exercise state. 

The UMB distribution to the DIUs is maintained by odd parity over 
all data in TS 127 within the DIC. 


4.5 Exercises 


The DIU contains a more comprehensive list of exercise routines 
than the digroup terminal unit does. Normally, exercises are used to 
test on a periodic basis the error source registers and matchers in the 
DIU. However, in the DIU a set of exercises tests the per-digroup 
equipment when the DIU is protection-switched. More specifically, the 
DIC has a set of diagnostic routines that tests for the ability to: 


(i) detect framing errors, 
(ii) reframe, 
(iii) detect forced D2, 
(iv) send forced D2, 
(v) transmit and receive signaling. 


For items (ii) and (iii), time limits are set on the response of the DIU 
so that deterioration in the ability of the F/R logic to respond properly 
can be detected. 


V. DIGITAL INTERFACE CONTROLLER 


5.1 Design objectives 


The DIC represents the latest development in the evolution of the 
No. 4 ESS peripheral frame controllers.* Its design draws heavily on 
experience accrued over earlier transmission/switching designs, and 
capitalizes on microprocessor technology to allow flexibility to respond 
to change. 

Functionally, the DIC supports the DIUs in much the same way that 
the digroup terminal controllers and the related portions of the Signal 
Processor 2 supported the digroup terminal units. It provides for the 
collection, processing and distribution of supervisory and address 
signaling information, a reliable clock source, and an interface to the 
PUB of No. 4 ESS. 

The DIC provides reconfiguration of the DIUs under fault conditions 
by controlling the protection switching of the spare units and 
appropriately redirecting signaling interchanges. The transmission facility 
maintenance and DIU maintenance are provided via a maintenance 
microcomputer developed in the controllers, around the Bellmac*-8 
microprocessor. 

Many software considerations went into the design of the DIC. The 
repertoire of operational (call processing related) peripheral bus orders 
was chosen to be compatible with those of the Signal Processor 2. This 
minimized the associated call processing software development. The 
maintenance software in the 4E5 generic was dramatically restructured 
to allow for “intelligent” controllers and to minimize and modularize 
hardware dependent software. The DIC was designed to minimize 
generic DIU reconfiguration software, and controls most DIU 
reconfiguration actions with frame resident “firmware.” 

The DIC was designed so that a significant portion of the controller 
hardware and diagnostic software could be used in common with the 
Peripheral Unit Controller (PUC) associated with the mass 
announcement system feature in 4E5.° This minimized overall development 
effort, and allowed the PUC to capitalize on the larger scale of 
manufacture of the DIC. The maintenance software development for the PUC 
was also facilitated, in that common hardware characteristics minimized 
differences in the frame-dependent software. 


5.2 Controller architecture 


Figure 17 shows the overall architecture of the DIC. Since a failure 
of a DIC could affect up to 3840 trunks, the controller is fully duplicated. 
Either controller can support all the DIUs, while its mate is being 
diagnosed and repaired. The two controllers are independently 
powered and are provided independent clock inputs (Master Timing Links 
or MTLs) from the associated TSIs or echo suppressor terminals. Two 
MTLs are provided per controller. 

The controllers derive their internal timing from either connecting 
MTL, and each controller has its own independent countdown chain. 
Synchronization signals are provided by both controllers to the DIUs 
which select one and only one controller’s timing to drive the internal 
DIU functions. The ten signals provided to the DIUs allow their DS-120 
outputs to be synchronous to the No. 4 ESS time division network and 
also determine the outgoing T1 line frequency of 1.544 Mb/s. 


* Trademark of Western Electric. 

Fig. 17—Duplex DIC overview. 

The PUB interface, which is also duplicated, terminates in the bus 
access circuits as shown in Fig. 17. This interface is fully configurable 
so that either controller can take its input from either bus, and can 
reply on either or both buses. In addition to the PUB, the 1A Processor 
is able to access the DICs via pulse points. These pulse points are 
employed for recovery actions, such as putting a controller in the 
maintenance state, selecting clocks, or disabling a controller’s protec- 
tion-switch capability. The pulse points are provided via independent 
signal processors for each controller. 

The maintenance and operational data which pass between the DICs 
and DIUs are time-division multiplexed. Consequently, a transmission 
path is not required from each DIU for each signal. This results in 
simplified intraframe cabling and minimal select circuitry in the DIUs. 


5.3 Interface to the peripheral unit bus 


The DIC terminates the entire PUB. The bus consists of four 
duplicated bus groups: 
• The PU enable/address bus, which conveys address information that 
determines which frame in the No. 4 periphery should respond to a 
particular peripheral order. 
• The PU write bus (PUWB), which conveys the data to be accepted by 
the DIC and processed. 
• The PU reply bus (PURB), which is used to return data to the 1A 
Processor. 
• The PU control bus over which control and maintenance information 
is transmitted to and from the peripherals. 

The bus access circuitry terminating PUB 0 is independently powered 
from that terminating PUB 1, and both are powered independently of 
either controller. In this way, a failure of either bus access circuit or 
either controller does not affect the balance of the controller and bus 
circuitry. 

As part of the bus access circuitry, “bus clamps” are provided to 
prevent a faulty controller from “babbling” onto either bus. These 
clamps are controlled with a combination of the maintenance access 
pulse point and either the member interrogate or group interrogate 
bits of the control bus. When a bus or controller is powered down, the 
clamps are manipulated by power sequencing logic to prevent babbling. 

The PUB signals incoming to the DIC are terminated in series 
receivers before passing on to the next peripheral frame. So that replacement 
of a receive pack does not interrupt the continuity of these signals, 
bypass resistors are provided. If the DIF is the last peripheral on the 
bus, optional terminating networks are employed. 


5.4 Simplex controller architecture 

Figure 18 shows the internal architecture of one of the duplicated 
controller circuits, commonly referred to as a simplex controller. PUB 
orders coming to the DIC are stored in the receive logic, and tested for 
validity. If the order is directed to this particular DIF and obeys the 
appropriate protocol, the receive logic interrupts the Executive 
Controller (EXEC). The EXEC routes the order to the appropriate function 
via the internal bus. If the order requires a reply, the data from the 
subject function is routed through the internal bus and into the reply 
logic. The reply logic then controls the appearance of the reply data 
on the PURB. 

The function of the receive logic is to store the bits received from 
the PUB and test the received information for the format and address- 
ing. In addition, it provides synchronization between the asynchronous 
PUWB data and the synchronous interface of the EXEC controller. The 
reply logic provides the necessary reply formatting of the information 
returned to the 1A Processor via the PURB. There are a number of 
special reply bit fields (e.g., maintenance data) that are required of the 
DICs. The receive and reply logic provides the control signals to handle 
these special requests from the 1A Processor. 

The internal bus structure of the DIC consists of a multiplexed bus. 
This bus structure was chosen to minimize fault susceptibility, to 
alleviate the necessity for multiple high-powered drivers, and to allow 
better control of the bus topology. All internal bus lines are looped 
back to the internal bus circuit pack driving them, for impedance 
termination and maintenance. The internal bus, which is under control 
of the EXEC, selects one of 16 possible ports or data sources to be 
placed onto the internal bus. Each port consists of 24 data bits and a 
parity bit. Each functional entity within the DIC is considered a port 
on the internal bus. The receive and reply logic is treated the same as 
the other functional entities within the DIC. This parallel high-speed 
structure allows flexible communication within the DIC. 


Fig. 18—Simplex DIC. 


The EXEC configures the internal bus multiplexers with four bits 
of select information to specify which data of the 16 ports it will route 
through the internal bus. These four internal bus select leads are 
duplicated for maintenance. The EXEC also sends to the functions 
within the DIF six bits of source address, six bits of destination address, 
four bits of internal bus operation code, and an internal bus load pulse. 
The source address specifies which function will be placing the data 
onto the internal bus. This allows for premultiplexing, where necessary, 
in the various functions. The six-bit destination address specifies the 
recipient of the data. The function addressed loads its internal register 
with the data on the internal bus upon receiving the load pulse. Each 
function also returns to the EXEC a source and destination 
acknowledge. This is an acknowledgment of a function’s receipt of a valid 
source or destination for one of its registers. This acknowledge allows 
the EXEC to check the source and destination decoders in the various 
DIF functions. 

The controller handles all internal communications in the DIC and 
communication with the 1A Processor. The EXEC is a bit-sliced, bipolar 
microprocessor. The microstore is modular with a maximum capacity 
of 4000 words of microprogram. The EXEC contains a 16-level priority 
interrupt control circuit, a 12-bit microsequencer, and an 8-bit 
arithmetic logic unit (ALU). Internal bus access logic allows the ALU to 
access 24-bit data from various DIC functions. Inputs to the EXEC 
consist of interrupt signals from various functions within the DIC. 
These signals indicate a request for a particular task routine to be 
executed. 

The microprogram of the EXEC consists primarily of an interrupt 
handler and numerous special purpose task routines to control internal 
bus transactions. Interrupts from the receive logic (i.e., from the 1A 
Processor), from the maintenance microcomputer, from the unit sig- 
naling processor, and from a real-time clock (10 ms) source can all 
initiate such task routines. As an example of the microprogram exe- 
cution, consider the receipt of an interrupt signal from the receive 
logic. The execution of an EXEC order can be broken down into three 
parts. The first of these three is the time from the presentation of the 
interrupt source until the interrupt is checked and detected. This is a 
noninterruptible state and consists of tasks that cannot be subdivided 
or “segmented.” An example of a noninterruptible task is a bus transfer 
with an internal bus client with volatile data. 

The second part of EXEC order execution is the time to transfer 
control to the interrupting routine. This is the time for the interrupt 
to be acknowledged, for control to be passed through the interrupt 
jump vector table, and for the resetting of the interrupt circuit. The 
third portion consists of the actual execution time of the requested 
function. The EXEC must complete the data transaction within a finite 
time window or the receive logic triggers a controller alarm. This is 
done as a sanity check against the EXEC. 
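
The second and third parts of this sequence (passing control through a jump vector table, resetting the interrupt, and running the task routine) can be sketched in a few lines. This is an illustration only: the text does not say which source is wired to which of the 16 levels or which level wins, so the sketch assumes level 0 has the highest priority, and the level assignments in the usage example are hypothetical.

```python
NUM_LEVELS = 16

def highest_pending(pending_mask):
    """Return the highest-priority pending level, or None if none is set.

    Assumption: level 0 has the highest priority.
    """
    for level in range(NUM_LEVELS):
        if pending_mask & (1 << level):
            return level
    return None

def service_one(pending_mask, vector_table):
    """Dispatch one interrupt; return (new_mask, task_result)."""
    level = highest_pending(pending_mask)
    if level is None:
        return pending_mask, None
    task = vector_table[level]                 # interrupt jump vector table
    new_mask = pending_mask & ~(1 << level)    # reset the serviced interrupt
    return new_mask, task()

# Hypothetical assignment: receive logic on level 2, real-time clock on 9.
table = {2: lambda: "receive-logic task", 9: lambda: "10-ms clock task"}
mask, result = service_one((1 << 9) | (1 << 2), table)
print(result, bin(mask))
```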

In addition to interrupt initiated task routines, the EXEC contains 
microdiagnostics which are invoked by 1A Processor resident diagnos- 
tic programs. These microdiagnostics extensively check the ALUs, 
microsequence controller, and microprogram store of the EXEC. 

The overall configuration of the DIC is determined by the data in the 
controller status register. Routing of PUB data, MTL selection, and 
controller selection are determined by the state of this register. The 
exercise register is used during diagnostics to exercise the 
fault-detection circuitry of the DIC and verify its ability to report. Under normal 
conditions, all hardware fault detector outputs are stored in the 
controller’s error-source registers. Each major functional element in the 
controller has a local error-source register, and a summary of these is 
recorded in the primary controller error-source register (CESR). This 
register is interrogated by maintenance software during a fault 
condition to isolate the fault to a particular controller. Unlike some of its 
predecessors, a minimum of fault detection in the DIC relies upon cross 
controller matching. Many processing elements are duplicated within 
a simplex controller to assure autonomous maintenance even under 
simplex operation. 

The disposition of the spare DIUs is controlled by the protection 
switch registers. The protection switch register outputs of the two 
controllers are logically ORed in the protection-switch equipment. 
Special cutoffs are provided to prevent a faulty controller from affect- 
ing a protection switch. The power sequencing circuitry activates these 
cutoffs (in addition to software control) when a controller is powered 
down. 

The operational functions of E signaling reception, M signaling 
distribution, and PCSINH are controlled by the unit signaling processor. 
The unit signaling processor function includes the collection and 
distribution of supervisory signaling, as well as dial pulse reception and 
outpulsing. The signaling processing is determined by state translation 
firmware, and therefore can be modified to respond to changes in pulse 
width requirements or general timing changes. The signaling on all 
trunks is processed every 10 ms, as initiated by the EXEC in response 
to the real-time clock interrupt. 

The results of signaling processing are reports which must be
communicated to the 1A Processor. The SP deposits reports via the
internal bus in one of four “buffers” or scratchpad RAM regions. These
buffers are designated high priority, low priority, seizure, and digit
buffers. The EXEC routes the data into the buffers and administers the
appropriate read and write pointers.
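
The buffer arrangement above can be pictured as four circular buffers, each with its own read and write pointers. This is a sketch under stated assumptions: the buffer size of 8, the dictionary keys, and the behavior on a full buffer are hypothetical, since the text does not describe them.

```python
class ReportBuffer:
    """Fixed-size circular buffer with separate read and write pointers."""

    def __init__(self, size):
        self.slots = [None] * size
        self.read = 0
        self.write = 0
        self.count = 0

    def put(self, report):
        if self.count == len(self.slots):
            return False  # full; the real overflow policy is not described
        self.slots[self.write] = report
        self.write = (self.write + 1) % len(self.slots)
        self.count += 1
        return True

    def get(self):
        if self.count == 0:
            return None  # nothing to report on this poll
        report = self.slots[self.read]
        self.read = (self.read + 1) % len(self.slots)
        self.count -= 1
        return report


# One buffer per report class named in the text; the size 8 is arbitrary.
BUFFERS = {name: ReportBuffer(8)
           for name in ("high_priority", "low_priority", "seizure", "digit")}


def route(kind, report):
    """Place a report in the buffer selected by its kind."""
    return BUFFERS[kind].put(report)
```

A poll that finds `get()` returning `None` corresponds to the case in which no reports are present.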

The 1A Processor periodically polls the DIF, just like an SP2, to
determine if any reports are present. If reports are present, the DIF
acknowledges the poll, and call-processing software reads the buffers
and either directs the unit signaling processor to continue processing
that trunk, or proceeds to connect a path through the No. 4 ESS
time-division network.

The DIC has a maintenance microcomputer (MMC) to maintain the
DIUs, aid in fault-recovery actions, or participate in diagnostics of other
portions of the controller. The MMC has a byte-addressable serial bus
with which it sends information to the DIUs. The DIU report streams
are converted to parallel format and deposited in the MMC RAM via
direct memory access (DMA). Following a complete DMA cycle, which
occurs once every 32 ms, the preprocessing hardware generates an
interrupt which initiates processing of the DIU alarm data. In this way,
DIU common alarms are “hit timed,” local and remote (T1) alarm
indications are detected, and facility reports such as slip, out of frames,
and error rates are prepared.

The MMC is built around a Bellmac-8 central processing unit (CPU) 
which is duplicated within a simplex controller for fault detection. The 
MMC executes tasks under direction of an interrupt-driven operating 
system called “OS8,” written in the C programming language. All of the
MMC resident application programs are written in C as well, with the 
exception of real-time intensive unit alarm processing tasks which are 
programmed in assembly language. 

The 1A Processor can access any memory location in the MMC via
DMA. Autonomous reports from the MMC are deposited in a portion of
buffer RAM designated the maintenance buffer. Like the operational
buffers, this is administered by the EXEC and periodically polled by
the 1A Processor.

The MMC is capable of (macro) peripheral order expansion, under
direction of the 1A Processor. In this way, complex tasks such as 
initialization and unit reconfiguration are relegated to the MMC and 
the 1A Processor maintenance software is considerably simplified. 
This was consistent with the restructuring of maintenance software, a 
significant part of the 4E5 generic. 

Each simplex DIC independently derives clocks that drive the
controller’s circuitry, and which are sent to the DIUs. The clocks are
derived from the MTLs originating in the time-division network (TDN)
of No. 4 ESS. In this way the DS-120 signals of the DIUs are fully
synchronous with the timing of the associated TSI.

To preserve commonality between the DIC and the PC, two major
features were a part of the design. First, a port of the internal bus was
equipped with bidirectional drivers/receivers to form the extended
internal bus. Since this extended bus is tristate, and any client output
shorting low would affect all clients, special extended bus cutoff signals
are provided. These can be used to quarantine any extended bus client
and allow the PC to continue operations.

Second, the PC does not have DIUs, so all DIU-related hardware was
packaged in the lower shelf of the DIC. By deleting this shelf and the
associated circuit packs, the PC avoids this unnecessary cost burden,
and 80 percent of the DIC hardware is still applicable to the PC.


VI. MAINTENANCE AND RECOVERY STRATEGY

6.1 Overview 


The maintenance and recovery strategy employed by No. 4 ESS is
structured into five levels. These levels are phase level, interrupt level, 
interject level, base level maintenance, and diagnostic isolation. Of 
these levels, the first four are related to identifying and isolating the 
failing subunit and recovering the system. The fifth level attempts to 
isolate the failures within a frame and is used to maintain the opera- 
tional soundness of the frame. As the service impact of a failure 
increases, the level of response to that failure is altered. Within this 
system, each frame can specify the initial level of recovery to be 
associated with a failure. 

With the introduction of the DIF, an additional level of maintenance
was added. The new level, routine exercise, is used to detect faulty
DIUs before they have been able to adversely affect the service they
provide. In the next few sections we present a description of the
maintenance and recovery strategies specifically used to support the
DIF. Table I shows the maintenance and recovery strategies to be
covered.


6.2 Phase level 


Phase level is the severest of the recovery strategies, in that service 
interruption may be experienced during its execution. Phases are 
stimulated when a failure seriously hinders the effective operation of 
the system or if reinitialization of part or all of the system is required 
for recovery. Phase level is further segmented into four levels. The 
most drastic, level 4 (phase 4), is the level of phase recovery we will 
concentrate upon. 


Table I. Maintenance and recovery software hierarchy

Phase Level
    System failures
    Action: reinitialize system

Interrupt Level
    Controller (DIC) errors
    Action: DIC configuration

Interject Level
    Unit (DIU) errors
    Action: DIU configuration

Base Level
    Normal operating level
    Action:
        Base level maintenance
        Diagnostics

A phase 4 cleanly reinitializes the entire No. 4 ESS office, leaving
failing frames out of service. In previous generic issues, frame
initialization during level 4 executed in a strictly serial manner and
resulted in simplex operation. In the case of the DIF, a new approach to
level 4 initialization was implemented. The DIFs within an office are
initialized to duplex operation, with the initialization of all DIFs
being done in parallel.

To a large degree, this change in initialization design can be credited
to the incorporation of a maintenance microcomputer (MMC) within
the DIF. The MMC is responsible for the bulk of the frame initialization
function. The MMC accepts orders from the 1A Processor, which it
expands and executes in addition to executing its internally generated
maintenance actions. The internal expansion of 1A Processor orders
by the MMC allows the 1A Processor to issue a command to the DIF
and have an entire function executed, while the 1A Processor does
something else. Thus, during phase 4, a small amount of configuration
is performed on the first DIF, followed by the issuance of a frame
initialization order. As the frame “init” order is processed by the first
DIF’s MMC, the 1A Processor directs its attention to the next DIF in the
office. This sequence is repeated until all DIFs have been initialized.
Upon the completion of the frame “init” order, the MMC returns a
success or failure indication. The 1A Processor uses this information
to determine the resultant configuration of each DIF (duplex, simplex,
or duplex failed). The final result of the phase 4 is a stable operating
office.
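
The issue-then-collect pattern of phase 4 can be sketched with a thread pool standing in for the independent MMCs. Everything named here is hypothetical: `mmc_frame_init`, the frame dictionaries, and the convention that the returned count of initialized controllers determines the configuration are assumptions for illustration, not the 1A Processor interface.

```python
from concurrent.futures import ThreadPoolExecutor


def mmc_frame_init(frame):
    """Stand-in for an MMC expanding a frame "init" order.

    Returns the number of controllers that initialized successfully.
    """
    return frame["healthy_controllers"]


def phase4(frames):
    """Issue an init order to every DIF, then collect the indications."""
    results = {}
    with ThreadPoolExecutor() as pool:
        # Issue the order to each DIF in turn without waiting for completion.
        pending = {f["name"]: pool.submit(mmc_frame_init, f) for f in frames}
        # Collect each success or failure indication and classify the frame.
        for name, future in pending.items():
            ok = future.result()
            results[name] = {2: "duplex", 1: "simplex"}.get(ok, "duplex failed")
    return results
```

The point of the sketch is the ordering: all orders are issued before any result is awaited, so total initialization time approaches that of the slowest frame rather than the sum over all frames.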


6.3 Interrupt level 


In developing the DIF, it was concluded that errors within the DIC
should be reported separately from those errors associated with DIUs
to simplify the resolution of faults. It was also decided that DIC errors
should be reported at a higher level to prevent the possible loss of all
the DIUs in the alarming frame. The F-level interrupt is the mechanism
used by the DIF to notify the 1A Processor of failures associated
specifically with its controller circuitry. It is the initial level of recovery
associated with a DIF controller failure.

Let us suppose that a No. 4 ESS office is operating stably. Further,
suppose that a DIC in that office experiences a problem. This problem
may be due to a hard failure, the loss of a circuit pack, or it may be
caused by a transient failure condition. In any case, the operation of
the DIF has been disrupted and needs to be corrected. An interrupt
(F-level) is the mechanism used to inform the 1A Processor of the
problem and request action to resolve the failure.

A failure detected within a DIC results in the setting of an ESR bit(s)
associated with the failed circuitry. The setting of this bit(s) stimulates
the F-level interrupt. The recognition of this interrupt at the 1A
Processor invokes the DIF interrupt recovery package, DIFRINTR. It is
the responsibility of DIFRINTR to determine the appropriate frame
recovery action based upon the error-source signature, the initial
configuration of the frame, and the relative frequency of reported ESR
bits from the frame.

The actions available to DIFRINTR are basically three. DIFRINTR can 
decide that the appropriate recovery action is the restoral of the 
alarming controller. This may be dictated by the transient nature of 
the failure or the simplex configuration of the frame. DIFRINTR can 
request the listen-only removal of the alarming controller followed by 
a diagnostic. The majority of the failures occurring in duplex frames 
are handled using this option. Removing a controller to listen-only 
allows it to be isolated from its mate, yet kept entirely up to date until 
the diagnostic begins. The listen-only removal request allows DIFRINTR 
a second chance if the wrong decision as to which controller to remove 
was made. The third option available to DIFRINTR is the zero-start 
restoral of the mate. This option is executed as a final attempt to 
preserve the operation of the frame. It terminates all stable and 
transient calls that are being handled by the alarming DIF. The
alarming controller is removed from service while the mate controller 
is completely initialized. Failure of this option results in the duplex 
failure of the alarming DIF. 
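
The choice among the three actions can be summarized in a few lines. This is a deliberately simplified sketch: the actual DIFRINTR weighs the error-source signature, the initial frame configuration, and the frequency of ESR reports in ways the text does not detail, so the three boolean inputs and their precedence below are hypothetical.

```python
RESTORE = "restore alarming controller"
LISTEN_ONLY = "listen-only removal, then diagnostic"
ZERO_START = "zero-start restoral of mate"


def choose_recovery(transient, duplex, earlier_recovery_failed):
    """Pick one of the three recovery actions named in the text.

    transient: the failure appears to be transient (assumed input).
    duplex: the frame was duplex when the failure was reported.
    earlier_recovery_failed: milder actions have not stabilized the frame.
    """
    if earlier_recovery_failed:
        return ZERO_START   # final attempt to preserve frame operation
    if transient or not duplex:
        return RESTORE      # transient failure or simplex configuration
    return LISTEN_ONLY      # usual choice for failures in duplex frames
```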


6.4 Interject level and base level maintenance 


Both the interject and base level maintenance (BLM) reporting
mechanisms are additional methods used to inform the 1A Processor
of a problem and request action to resolve the failure. In relation to
the DIF, these levels of recovery are strictly reserved for
DIU-associated failures. Within the DIF, these failures are monitored
and reported by internal frame processes.

The maintenance of each DIU is performed internal to the DIF
within its MMC complex. On a 32-ms cycle, data associated with the
performance and health of each DIU in the frame is written into the
MMC memory spectrum. The MMC real-time DIU maintenance firmware
scans this data in search of errors and reports both common alarms
and digroup alarms to the 1A Processor.

Common alarms indicate a malfunction affecting the operation of
120 trunks, an entire DIU. To preserve the operation of this alarming
subunit, these failures need to be detected quickly. For this reason,
they are scanned for every 32 ms. The discovery of a common alarm
places that DIU on the hit timing list. If the alarm exists for three
consecutive scans (96 ms) of the DIU data, it is classified as a hard
failure and reported to the 1A Processor. Common alarms are reported
via the Autonomous Peripheral Unit Trouble bit in the frame’s
primary ESR. This causes interject processing to be scheduled by the
1A Processor. Reporting common alarms via an interject is rapid
enough (served within 20 ms) to allow the recovery software the option
of protection switching a spare DIU for the alarming DIU.
Protection switching preserves stable calls active in the alarming DIU.
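
Hit timing of common alarms can be sketched as a per-DIU counter of consecutive alarmed scans. The class and method names are hypothetical, and clearing the count as soon as one scan shows no alarm is an assumption about how a transient hit is discarded.

```python
HARD_THRESHOLD = 3  # consecutive 32-ms scans (96 ms), per the text


class HitTimer:
    def __init__(self):
        self.hits = {}  # DIU id -> count of consecutive alarmed scans

    def scan(self, alarmed_dius):
        """Process one scan; return DIUs newly classified as hard failures."""
        for diu in list(self.hits):
            if diu not in alarmed_dius:
                del self.hits[diu]  # alarm cleared: treated as a transient hit
        hard = []
        for diu in sorted(alarmed_dius):
            self.hits[diu] = self.hits.get(diu, 0) + 1
            if self.hits[diu] == HARD_THRESHOLD:
                hard.append(diu)  # reported once, on the third scan
        return hard
```

Calling `scan` once per 32-ms cycle with the set of DIUs currently showing a common alarm yields a report only on the third consecutive alarmed scan, so isolated hits never reach the 1A Processor.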

Digroup alarms are less critical with respect to trunk effect than
common alarms. These failures affect the operation of only a single
digroup (24 trunks) within a single DIU. These indicators are scanned
once every 384 ms. The lower scanning frequency for these failures
results from the requirement that all DIUs be checked for common
alarms every 32-ms cycle. The time remaining in each cycle after this
checking permits the scanning of three DIUs for digroup failures. Even
with the longer scan time, digroup failures are hit timed for three
counts before being labeled hard. Hard failures of the digroup variety
are reported via a BLM. A BLM is less system-affecting than an interject
and is the result of a failure report being retrieved from the MMC
maintenance buffer on base level. The BLM removes the failing digroup
from operation.
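
The scan figures quoted above fit together arithmetically, as the short calculation below shows. The derived quantities are inferences from the stated numbers, not figures given in the text: in particular, the count of DIUs covered per pass assumes a simple rotation through three DIUs per cycle, and the digroup detection time assumes the three counts fall on successive 384-ms scans.

```python
CYCLE_MS = 32            # common-alarm scan cycle
DIGROUP_PERIOD_MS = 384  # each digroup indicator is scanned this often
DIUS_PER_CYCLE = 3       # DIUs scanned for digroup alarms in one cycle
HIT_COUNT = 3            # counts before a failure is labeled hard
TRUNKS_PER_DIU = 120
TRUNKS_PER_DIGROUP = 24

cycles_per_pass = DIGROUP_PERIOD_MS // CYCLE_MS          # 12 cycles
dius_covered = cycles_per_pass * DIUS_PER_CYCLE          # 36 DIUs per pass
digroups_per_diu = TRUNKS_PER_DIU // TRUNKS_PER_DIGROUP  # 5 digroups
common_hard_ms = HIT_COUNT * CYCLE_MS                    # 96 ms
digroup_hard_ms = HIT_COUNT * DIGROUP_PERIOD_MS          # 1152 ms
```

On these assumptions a hard digroup failure is recognized in a little over one second, against 96 ms for a common alarm, which matches the lower urgency assigned to digroup failures.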


6.5 Diagnostics 


Once the recovery software (outlined above) has isolated an alarming 
DIC or DIU, the DIF diagnostic is called upon to further resolve the
failures. It is the responsibility of the diagnostic to specifically deter- 
mine which circuitry, if any, has failed. Toward this end, the DIF 
diagnostics are structured into phases. Each phase is responsible for 
determining the operational fitness of a specific section of the frame’s 
circuitry. Thus, the operational fitness of the entire frame is based on 
an all-tests passed condition being achieved by all phases of the DIF 
diagnostic. A diagnostic phase should not be confused with the system 
phases discussed earlier. 

The DIF controller diagnostic, for example, consists of 23 distinct
phases. The phases of the DIF diagnostic are ordered such that the
fitness of the frame is checked using an “onion peeling” philosophy. In
peeling an onion you start at the outside and work your way inward.
The DIF diagnostics are implemented in much the same way. The
early phases of the DIC diagnostic start at the PUB and clock interfaces
to the frame and progress inward. The early phases confirm the 1A
Processor’s ability to gain access to the frame. Later phases attempt to
verify the functional integrity of the executive processor and check the
accessibility and operation of the DIC’s internal bus. At this point in
the diagnostic, the front end and the bus to the workings of the
controller have been verified. The remaining phases of the diagnostic
determine the soundness of the maintenance microcomputer complex
and the signal processor complex which are accessible via the internal
bus.
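
The outside-in ordering can be sketched as an ordered list of phases, each a test that must pass before the next, deeper layer is examined. Stopping at the first failing phase is an assumption made here for illustration (the text does not say whether later phases still run), and the phase names a caller supplies are placeholders rather than the 23 actual phases.

```python
def run_diagnostic(phases):
    """Run phases outermost-first and stop at the first failure.

    phases: list of (name, test_fn) pairs ordered from the frame
    interfaces inward; each test_fn returns True when all tests pass.
    Returns (all_passed, name_of_first_failed_phase_or_None).
    """
    for name, test in phases:
        if not test():
            return False, name
    return True, None
```

For example, a list ordered as PUB and clock interfaces, executive processor, internal bus, MMC complex, and signal processor complex would report a bus fault only after the access path to the frame and the executive processor had been verified.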

The end result of a diagnostic analysis of the controller or subunit 
circuitry will be the isolation of the circuit pack(s) that have failed. 
Upon replacement of the faulty equipment, the diagnostic is executed 
again to ensure that the problem has been corrected before returning 
the DIC or DIU to service. 


6.6 Routine exercise 


In addition to the other functions detailed previously, the inclusion 
of the MMC complex has afforded the DIF the capability of executing
tasks (functions) on a routine basis. When the MMC has no requested
tasks to execute, it sequentially executes the tasks that have been 
identified as routine. To date, two tasks, DIU exercising and auditing 
of the protection-switch state, are run routinely within the DIF. 
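
The idle-time scheduling described here can be sketched as a queue of requested tasks backed by a repeating cycle of routine tasks. The class name and the first-in, first-out handling of requested tasks are assumptions; the text states only that routine tasks run sequentially when nothing has been requested.

```python
from collections import deque
from itertools import cycle


class MmcScheduler:
    def __init__(self, routine_tasks):
        self.requested = deque()            # tasks ordered by the 1A Processor
        self.routine = cycle(routine_tasks)  # repeats the routine list forever

    def request(self, task):
        self.requested.append(task)

    def next_task(self):
        """Requested work first; otherwise the next routine task in turn."""
        if self.requested:
            return self.requested.popleft()
        return next(self.routine)
```

With the two routine tasks named in the text, an idle MMC alternates between DIU exercising and the protection-switch audit, resuming where it left off after any requested task completes.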

The DIU exercise routine task uses test vectors to functionally test 
the operation of each DIU. If a DIU does not respond properly to the
test sequence, the failure is reported via BLM. The protection-switch 
state audit routine task checks for the proper configuration of the 
spare units. A failure of the audit results in an interject. 

The routine exercise capability gives the DIF the opportunity to
verify the operation of critical hardware segments functionally and
routinely. This constant monitoring provides early error detection and
correction.


VII. CONCLUSION


The development of the DIF was the result of the concerted effort of 
the components, switching, and transmission organizations in Bell 
Laboratories and of the components and system organizations in 
Western Electric. A comparison of the DIF with the DT/SP2, which it
replaces, indicates a significant reduction in cost, power, and space
requirements, an improvement in reliability, and a reduction in the
installation effort. By the end of 1980, it is expected that over 500,000
No. 4 ESS terminations on DIF will have been deployed.


This paper examines two different examples of how No. 4 ESS
software has evolved through restructuring to meet the needs of the
changing No. 4 ESS environment. The two software areas that
underwent varying degrees of incremental restructure are Call Processing
and Fault Recovery. We characterize the pre-restructure architectures,
discuss the motivation and rationale which led to restructure,
and present and evaluate the post-restructure architecture for each
software system.


I. INTRODUCTION


No. 4 ESS, the largest-capacity electronic switching system ever
developed by the Bell System, was developed to meet specific
objectives of capacity and reliability.¹ To meet these objectives, the
No. 4 ESS was designed using new hardware technology and a comprehensive
stored program. The primary objectives of the initial No. 4 ESS program
design were:

(i) real-time efficiency,
(ii) simple human interface,
(iii) defensive design,
(iv) ease of modification.

Since its initial service date, the No. 4 ESS has released a new generic
software package approximately once a year incorporating major new
hardware and software capabilities. Each new generic was built upon
the previous generic. As the number of features provided by the
No. 4 ESS grew, it became increasingly difficult in certain areas of
software to accommodate new features without affecting existing
features. The amount of time spent in regression testing had the
potential for becoming an ever-growing part of the software
development interval, thus increasing new feature development cost.

New software development methodologies that used top-down
design and structured programming techniques gained wider use in the
No. 4 ESS software development process. These rigorous approaches
to software design effectively pointed out where certain areas of No. 4
ESS software could be improved.

A new high-level programming language, EPLX, was introduced that 
supported structured programming techniques and provided increased 
program readability, modularity, and maintainability. 

Given the continuing demand for new features, the design objectives
for software development had to be enhanced to place greater emphasis
on ease of modification and flexibility to reduce the cost and
development time of new system features. This increased emphasis, along
with new software development methodologies and programming languages,
led to a selective restructuring of those areas of No. 4 ESS software
that were to be affected the most by new feature development.

The No. 4 ESS software areas which became major candidates for
restructuring were call processing and fault recovery. Sections II and
III describe the restructuring process for these two software areas.


II. CALL PROCESSING RESTRUCTURING


To better understand the motivations and rationale for restructuring,
we review the call processing architecture prior to restructure.² It
should be made clear that the entire call processing system was not 
restructured. Instead, an incremental restructuring occurred which 
focused primarily on the task programs responsible for call handling 
actions. We discuss the task programs in light of their original design 
and their deficiencies. We give the motivation for and approach to 
restructure, along with a discussion and evaluation of the new archi- 
tecture. 


2.1 Call Processing before restructure 


When the No. 4 ESS was cut over in 1976, it provided the capability to
interface with both local and toll switching machines, to function as a
tandem and/or toll switch, and to interface with all of the trunk
signaling types listed below:

(i) Dial Pulse (DP)
    (a) Delay Dial Start Dial
    (b) Immediate Start
    (c) Wink Start

(ii) Multifrequency (MF)
    (a) Wink Start
    (b) Delay Dial Start Dial

(iii) Common Channel Interoffice Signaling (CCIS)

The No. 4 ESS also provided the Centralized Automatic Message
Accounting (CAMA) function for trunks using dial pulse or MF address
signaling.

The call processing programs were initially structured in a three- 
level hierarchy as shown in Fig. 1. The task dispensers (Level 1), which 
were entered directly from Executive Control, provided the interface 
for external stimuli received from the signaling hardware (Signal 
Processors and CCIS terminals) and the interface for internal stimuli 
received from timing and queuing programs. Executive Control pro- 
vided both high- and low-priority entries to the task dispensers, which 
used the entries to poll the buffers in the signaling hardware for high- 
and low-priority reports and to determine if time-out conditions ex- 
isted. If reports or time-out conditions existed, they were dispensed 
sequentially to the appropriate task program for processing. The task
dispensers remained in control until all relevant external or internal 
stimuli had been processed or until an overload threshold had been 
reached that limited the amount of activity processed by the system 
during any base cycle. 

Fig. 1. Initial No. 4 ESS call processing architecture.

The task programs (Level 2) performed the specific actions that
switched calls. Task programs were entered from the task dispensers
in response to a particular stimulus. The task program investigated
the present state of the call and, depending upon the present state and
the stimulus, initiated the appropriate actions to advance the call to a
new state. The present state of a call can be determined from the call
register (CR) or trunk register (TR). The CR is a 64-word block of call
store memory used for temporary storage of information during call
setup. CRs are not dedicated on a per-trunk basis. Instead, there is an
engineered number of CRs per office which are link-listed together.
The TRs are two-word blocks of call store memory assigned on a
per-trunk basis. TRs contain dynamic information about the current state
of the trunk or call.
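
A shared pool of call registers that are linked together can be sketched as a free list from which a CR is seized at call setup and to which it is returned afterward. That the linkage serves as an idle list is an assumption, as are the class name, the index-based links, and the zeroing of a CR on seizure; only the 64-word size and the engineered per-office count come from the text.

```python
class CallRegisterPool:
    WORDS_PER_CR = 64  # size of one call register, per the text

    def __init__(self, count):
        # next_free[i] links idle CR i to the next idle CR; -1 ends the list.
        self.next_free = [i + 1 for i in range(count)]
        if count:
            self.next_free[-1] = -1
        self.head = 0 if count else -1
        self.store = [[0] * self.WORDS_PER_CR for _ in range(count)]

    def seize(self):
        """Unlink and initialize an idle CR; return its index, or None if none."""
        if self.head == -1:
            return None
        cr = self.head
        self.head = self.next_free[cr]
        self.store[cr] = [0] * self.WORDS_PER_CR
        return cr

    def release(self, cr):
        """Return a CR to the head of the idle list."""
        self.next_free[cr] = self.head
        self.head = cr
```

Because CRs are needed only during call setup, the pool can be much smaller than the number of trunks, whereas the two-word TRs must exist for every trunk.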

Certain repetitive or specialized call handling functions were de- 
signed as subroutines (Level 3) so they could be accessed by several 
task programs. Examples of call handling subroutines are seizing and 
initializing a CR, connecting incoming and outgoing trunks, hunting a 
service circuit, or pegging a traffic counter. 

The task programs also interfaced with other operational programs 
during the processing of a call. These interfaces were established to 
allow independent software development of major operational func- 
tions such as audits, translations, network management, and trunk 
maintenance. Where these functions overlapped during the processing 
of a call, clearly defined interfaces were established with the task 
programs. 


2.2 The Call Handling task programs 


The Task Program block in Fig. 1 represents the set of task programs
shown in Fig. 2. The task programs were organized on a
signaling-type/type-of-trunk basis and separated into incoming trunk and
outgoing trunk programs. Each program was state driven and was
responsible for acting on the stimuli dispensed from the task dispensers.
The incoming trunk programs processed internal and external stimuli
associated with the incoming trunk part of a call, and the outgoing trunk
programs processed internal and external stimuli associated with the
outgoing trunk part of a call. Internal stimuli were associated with
events such as timing or queuing reports. External stimuli were
physical trunk signals. Each incoming trunk program also had an interface
with the Digit Reception and Analysis Programs, which were
responsible for determining the outgoing routes for the call based upon
the dialed digits. Digit sending to a large degree was part of the
outgoing trunk programs.

Fig. 2. Detailed view of task programs.

The task program architecture arose primarily because of the means
of communication with the signaling hardware. Communication with
the signaling hardware was at a physical signal level (off-hook,
on-hook) rather than a logical signal level (seizure, answer). Therefore,
the signaling protocol for a specific trunk was required very early in
the processing of trunk signal reports. Rather than convert the physical
signal into a logical signal prior to dispensing reports to task programs,
the task programs were designed to handle the physical signals on a
signaling-type/type-of-trunk basis. This approach resulted in a set of
programs that were organized on an incoming and outgoing trunk basis
for each signaling type/type of trunk.

Each program processed stimuli associated with its particular
signaling type. However, there were always points in call setup where an
incoming and outgoing trunk were involved in the call. They could be
of the same or different signaling types. A stimulus at these stages of a
call usually required actions by both the incoming and outgoing trunk
programs. The design approach was to take one of two actions: (i) do
whatever processing is required by the incoming trunk program, then
pass control to the outgoing trunk program or vice versa, (ii) have the
incoming or outgoing trunk program process the signal for both trunks
associated with the call. The latter approach resulted in task programs
that no longer contained processing logic for a single signaling type or
for incoming or outgoing trunk. Incoming trunk programs made
decisions based upon the type of outgoing trunk associated with the call
and vice versa. For example, the CCIS task programs contained MF and
DP signaling logic, etc. This approach was generally taken to save real
time or to minimize program interfaces.

The drawbacks to such a task program design were: (i) proliferation
of decisions; (ii) duplication of program functions; (iii) dilution of
program cohesion; and (iv) loss of independence between incoming
and outgoing trunk. In some cases, task programs were call controllers
and in others, single trunk controllers. The interfaces between
incoming and outgoing trunk programs became many and complex as shown
in Fig. 3. The task program interfaces with the other operational
programs further complicated the picture.


2.3 Motivation for restructure 


With the development of the 4E3 generic for the No. 4 ESS, call
processing was enhanced to provide the International Gateway
Exchange Feature. This new feature required the addition of CCITT No.
5 and CCITT No. 6 signaling capabilities to call processing. Two new
incoming and outgoing task programs were required along with the
modification and retest of all existing task programs. Rather than add
these signaling types to the existing architecture, thus further
complicating an already complex structure, we considered restructuring the
call processing task programs.

Fig. 3. Task program internal interfaces.

The goal of restructuring was to minimize the drawbacks of the
current design while, at the same time, minimizing the effects of
restructure upon the existing task programs, which were known to be
real-time efficient and virtually error-free. Eliminating duplication,
strengthening program cohesion, and truly separating incoming and
outgoing trunk processing were of special importance, since future
generics are very likely to require additional call handling task
programs for new signaling types.


2.4 Approach to restructure 


The key problem to be resolved with the restructure effort was how 
to make incoming trunk and outgoing trunk programs truly independ- 
ent. The process began with formulating a universal call (shown in Fig. 
4) and identifying the call events that must be processed to complete 
the call. At this point no effort was made to distinguish incoming trunk 
from outgoing trunk. The main focus was on the overall call. Seven 
call events were identified: 

(i) Origination,
(ii) Digit reception,
(iii) Outgoing trunk selection,
(iv) Origination on the outgoing trunk,
(v) Digit sending,
(vi) Receive answer,
(vii) Receive disconnect.

The architecture began to materialize as a result of functionally 
decomposing the universal call into three sequential call stages: 

(i) Setup: that part of the call from seizure on the incoming trunk
through digit sending and connection of the incoming and outgoing
trunks.

(ii) Post Setup: that part of the call during which a voice
path is connected, while awaiting answer and in the talking state.

(iii) Clearing: hardware and software trunk idling sequences after
call termination.

The three sequential call stages were further decomposed into
incoming trunk (ICT) and outgoing trunk (OGT) processes, resulting in
the functional decomposition shown in Fig. 5.

The final phase of the process addressed the basic problem of 
isolating ICT and OGT processing. A new program function was created 
to consolidate the communication interfaces between ICT and OGT task
programs and to oversee common call related functions. This program 
was called the Report Dispenser. Its inclusion in the new architecture 
made it possible to remove from the task programs any trunk signaling 
logic dealing with the other trunk involved in the call and to create 
trunk handlers. 

Fig. 4. Universal call flow diagram.

Fig. 5. Functional decomposition of a universal call.

The Report Dispenser was the single most important addition to
the call processing architecture, because it introduced the use of logical
signals as the means of communication between trunk handlers. Trunk
handlers could now communicate with the Report Dispenser without
involving another trunk handler. Incoming trunk and outgoing trunk
processing became independent. The complex interface between
incoming and outgoing trunk programs had been replaced with a
standardized interface that used logical signals as a means of
communicating with a central point of control wherein call-related decisions
requiring knowledge of the other trunk were made. The new
architecture became one where task programs/trunk handlers communicated
with trunk circuitry via physical signals and communicated with other
trunk handlers via logical signals.
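
The division of labor between trunk handlers and the Report Dispenser can be sketched as follows. This is an illustrative model only: the class and method names, the state names, and the particular mapping from physical to logical signals are hypothetical, and the actual Report Dispenser also invokes per-call common functions that are omitted here.

```python
# Hypothetical mapping from (trunk state, physical signal) to logical signal.
LOGICAL = {
    ("idle", "off-hook"): "seizure",
    ("awaiting_answer", "off-hook"): "answer",
    ("talking", "on-hook"): "disconnect",
}


class TrunkHandler:
    def __init__(self, name):
        self.name = name
        self.received = []  # logical signals delivered by the Report Dispenser

    def physical(self, state, signal, dispenser):
        """Convert a physical signal to a logical event and report it."""
        dispenser.report(self, LOGICAL[(state, signal)])

    def logical(self, event):
        self.received.append(event)


class ReportDispenser:
    def __init__(self):
        self.other = {}  # handler -> the other trunk handler in the call

    def link(self, ict, ogt):
        self.other[ict] = ogt
        self.other[ogt] = ict

    def report(self, source, event):
        """Central point of control: forward the event to the other handler."""
        self.other[source].logical(event)
```

Neither handler holds a reference to the other or knows its signaling type; adding a handler for a new signaling type therefore requires no change to existing handlers, which was the stated aim of the restructure.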


2.5 New architecture overview 


The primary features of the new architecture are as follows: 

(i) The splitting of call handling into parallel real-time processes
(finite state machines), which control states of the incoming trunk, the
outgoing trunk selection process, and the outgoing trunk.

(ii) The consolidation of communication decisions, which link these
finite state machines, in a program called the Report Dispenser.

(iii) The identification of a subset of call handling functions that
can be implemented as subprocesses (submachines) under control of
incoming or outgoing trunk handlers. These functions were common
to most calls and relatively independent of signaling type. They are
digit reception and digit sending.

An architecture based upon trunk handlers is advantageous from 
the standpoint of minimal impact upon the existing call processing 
task programs. The basic logic of the task programs can be maintained; 
the changes are limited to separating incoming and outgoing trunk 
functions, eliminating redundant code, and interfacing the task pro- 
grams with the Report Dispenser. Figure 6 illustrates the major 
modules in the new architecture and the control hierarchy. 

All task dispenser reports are made on a trunk state basis. This 
means that each report is dispensed, based strictly on the state of one 
trunk involved in a call, to the trunk handler responsible for handling 
that type of trunk. 

All internal and external stimuli are dispensed by the task dispensers 
in the same manner as existed in the pre-restructure system. A new 
task dispenser was added as part of the restructure effort to interface 
with the CCITT No. 6 signaling terminal. 

When a trunk handler receives a physical signal from the task 
dispenser, it takes whatever action is appropriate and then reports a 
logical call event to the Report Dispenser. The Report Dispenser 
determines the next call action to initiate based upon the logical event 
and may invoke per-call common functions, such as outgoing trunk 
selection, or invoke the other trunk handler involved with the call. 
When the signal has been completely processed, control is returned to 
the task dispenser via the Report Dispenser and the trunk handler 
that initially received the stimulus. In summary, a physical signal is 
passed to the trunk handler, which converts the signal to a logical 
signal (answer, disconnect, etc.) based upon the state of the trunk. The 
logical signal then becomes the stimulus that causes the Report 
Dispenser to initiate further call processing actions. 
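
The conversion of a physical signal into a logical signal can be sketched in Python. This is only an illustration under stated assumptions, not the No. 4 ESS code: the state names, the signal names, and the classes `OutgoingTrunkHandler` and `ReportDispenser` are hypothetical, and the stand-in dispenser merely records what it is told.

```python
from enum import Enum, auto


class Physical(Enum):
    """Physical signals seen on the trunk circuit (illustrative subset)."""
    OFF_HOOK = auto()
    ON_HOOK = auto()


class Logical(Enum):
    """Logical call events reported to the Report Dispenser."""
    ANSWER = auto()
    DISCONNECT = auto()


class State(Enum):
    """Hypothetical outgoing-trunk states."""
    WAITING_FOR_ANSWER = auto()
    TALKING = auto()
    DISCONNECTED = auto()


class ReportDispenser:
    """Stand-in that only records the logical events it receives."""

    def __init__(self):
        self.events = []

    def report(self, trunk, event):
        self.events.append((trunk, event))


class OutgoingTrunkHandler:
    # (current state, physical signal) -> (logical event, next state)
    TRANSITIONS = {
        (State.WAITING_FOR_ANSWER, Physical.OFF_HOOK): (Logical.ANSWER, State.TALKING),
        (State.TALKING, Physical.ON_HOOK): (Logical.DISCONNECT, State.DISCONNECTED),
    }

    def __init__(self, trunk, dispenser):
        self.trunk = trunk
        self.dispenser = dispenser
        self.state = State.WAITING_FOR_ANSWER

    def handle(self, signal):
        """Interpret a physical signal in light of the current trunk state."""
        entry = self.TRANSITIONS.get((self.state, signal))
        if entry is None:
            return None  # no meaning in this state; ignored in this sketch
        event, self.state = entry
        self.dispenser.report(self.trunk, event)
        return event
```

The point of the table-driven form is that the same physical signal maps to different logical events depending on the trunk state, and the handler never needs to know anything about the other trunk on the call.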

The block called Common Call Functions in Fig. 6 consolidates 
many common call related functions and interfaces which in the pre- 
restructure architecture were spread throughout the task programs. 
Many of the interfaces with the other operational programs are con- 
solidated here. 

The Digit Reception and Digit Sending functions appear as 
submachines under the incoming and outgoing trunk handlers. They are 
programs which are invoked by the trunk handlers and are logical 
rather than physical signal driven. Task dispenser reports are directed 
to these submachines and not to the trunk handlers. This allows for 
efficient real-time execution in the processing of these reports and 
does not burden the trunk handlers with detailed knowledge of how 
the submachine performs its function. 

Call Failure Control has the same relative position in the new 
architecture as the Report Dispenser and is responsible for controlling 
the clearing of incoming and outgoing trunks as a result of Ineffective 
Attempts, i.e., calls that are not successfully completed. 


2.5.1 A simple call 


To clarify the structure, interfaces, and control, 
we describe a simple DP-to-DP call. We incorporate only those events 
needed to successfully complete the call because the picture becomes 
more complicated when call anomalies are taken into consideration. 
The scenario is based upon the universal call diagram. Figure 7 
represents the call flow diagram for the call. Incoming trunk actions 
are represented along the top horizontal axis. Logical events are 
reported to the Report Dispenser, which then communicates these call 
events to the OGT. Outgoing trunk actions are represented along the 
bottom horizontal axis. The call actions progress in sequence from left 
to right. 

The call begins with the receipt of an off-hook origination on the 
idle ICT. The physical off-hook signal is passed from the task dispenser 
to the ICT handler, which prepares for digit collection. When the ICT is 
ready to receive digits, an integrity check signal is sent backward 
toward the originating office. The next ICT action is to receive digits. 
This action is performed by the Digit Reception Program. The Digit 
Reception Program also analyzes the digits to determine the outgoing 
trunk group for the call. The Report Dispenser is now notified that 
the call is ready for OGT selection. The Report Dispenser invokes the 
OGT selection program, which is a common call function. After a 
successful return from the OGT selection program, the Report 
Dispenser invokes the OGT handler. The first OGT handler action is to 
seize the OGT. After the trunk is seized, the Report Dispenser is 
informed that seizure is complete. For this call, no ICT action is 
required at this point. Action is required if the ICT is CCIS or CCITT 
No. 6. This knowledge resides only within the Report Dispenser. The OGT 
handler waits for receipt of the integrity check signal from the far-end 
office indicating readiness to receive digits. The OGT handler invokes 
the Digit Sending program, which deletes or prefixes digits to the 
dialed number and controls the outpulsing process on the OGT. When all 
digits have been outpulsed, control passes back to the Report Dispenser, 
indicating that the outgoing part of the call is complete. At this 
point, the call moves from the setup stage to the post setup stage. The 
CR is released and the 
voice path between ICT and OGT is completed. Both actions are 
common call functions. Each trunk handler places itself in the 
waiting-for-answer state. The next signal the No. 4 ESS expects to see 
is off-hook answer on the OGT or on-hook clearforward on the ICT, should 
the originator disconnect. In the case of the answer signal, the OGT 
handler processes the signal for the OGT and reports the logical event 
to the Report Dispenser, which, in turn, passes the event to the ICT 
handler for processing. The next event in the call will be an on-hook 
disconnect on either trunk. 

Fig. 7—Dial pulse-to-dial pulse call flow. 

If the ICT receives an on-hook clearforward, the ICT handler reports 
the event to the Report Dispenser, which invokes the ICT and OGT 
clearing routines to idle the trunks. If an on-hook clearback signal is 
received by the OGT, the OGT handler passes the event to the Report 
Dispenser, which invokes the ICT handler to send a clearback signal on 
the ICT. A clearback does not cause the call to be idled. A clearforward 
must be received to idle the call. 

This simple example of the DP-to-DP call demonstrates how the 
Report Dispenser isolates the ICT and OGT from having knowledge of 
the other. We can then extend this example to cases where the ICT and 
OGT are of different signaling types and show that, by communicating 
with the Report Dispenser using logical call signals, any type of ICT 
can interwork with any type of OGT, provided that the necessary logical 
signals have been defined. 


2.5.2 The report dispenser 


Communication between the trunk handlers is consolidated in the 
Report Dispenser. When a trunk handler detects a logical event that 
may be significant to the other trunk handler on that call, it reports 
that event to the Report Dispenser. The Report Dispenser determines 
the other trunk handler on the call and passes control to the appro- 
priate trunk handler. This consolidation of what are primarily signaling 
type decisions about the other trunk handlers involved in the call 
results in an overall reduction of code and simplifies the addition of 
new signaling types. The addition of a new signaling type to this call 
processing system obviously involves the design and development of 
an ICT and OGT handler. However, if the new signaling type does not 
require the addition of any new logical signals, then the Report 
Dispenser only requires slight modification to include the new signaling 
type. 

The trunk handlers communicate with the Report Dispenser by 
means of logical signals and pass additional data with the CR and TR. 
The signals are divided into two categories: setup and post setup. 
Setup signals are passed to the Report Dispenser, along with the CR, 
during the setup stage of a call. The signaling type of the ICT and OGT 
and the state of the call are stored in the CR. Based upon CR data and 
the logical signal, the Report Dispenser makes the decision on what to 
do next in the processing of the call. Post setup signals are passed to 
the Report Dispenser in the post setup stage of the call, which is after 
the CR has been released. The Trunk Scanner Number (TSN), which 
identifies the ICT or OGT, is passed along with the logical signal. 
Through data translations using the TSN, the signaling type of the 
trunk and the TR are found. Based upon the signaling type of the trunk 
and the logical signal, the proper next step in the call can be taken. 
This mechanism then allows ICT and OGT handlers to have no 
knowledge about the other trunk involved in the call. That knowledge, 
along with the knowledge of the state of the call, resides within the 
Report Dispenser. 
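
The two signal categories can be sketched as follows in Python. This is a sketch under assumptions rather than the actual program: the signal names, the returned action strings, the `TRANSLATIONS` table, and the idea that the TR records the TSN of the other trunk are all hypothetical and chosen only to show where each piece of knowledge lives.

```python
from dataclasses import dataclass


@dataclass
class CallRegister:
    """Exists only during setup; holds both signaling types and call state."""
    ict_type: str
    ogt_type: str
    call_state: str


@dataclass
class TrunkRegister:
    """Hypothetical per-trunk record; linking to the other trunk is assumed."""
    tsn: int
    other_tsn: int


# Hypothetical translation data: TSN -> (signaling type, trunk register)
TRANSLATIONS = {
    100: ("DP", TrunkRegister(tsn=100, other_tsn=200)),
    200: ("CCIS", TrunkRegister(tsn=200, other_tsn=100)),
}


def dispense_setup(signal, cr):
    """Setup stage: decide from CR data plus the logical signal."""
    if signal == "ready_for_ogt_selection":
        return "select_ogt"
    if signal == "seizure_complete":
        # Only some ICT signaling types need an action at this point.
        if cr.ict_type in ("CCIS", "CCITT6"):
            return "notify_ict"
        return "no_ict_action"
    raise ValueError(f"undefined setup signal: {signal}")


def dispense_post_setup(signal, tsn):
    """Post setup stage: the CR is gone, so everything is found via the TSN."""
    _own_type, tr = TRANSLATIONS[tsn]
    other_type, _other_tr = TRANSLATIONS[tr.other_tsn]
    # Return which handler type should process the event, and for which trunk.
    return (other_type, tr.other_tsn, signal)
```

For example, `dispense_post_setup("answer", 200)` selects the DP handler for trunk 100, while the handler that reported the answer never learns what kind of trunk is on the other side.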

There are several functions in call processing that are call-event 
dependent and signaling-type independent. The Report Dispenser 
provides a central point for calling procedures associated with these 
common call functions. The common call functions include: 

(i) Voice path setup and take down. 

(ii) Interfacing with the No. 4 ESS Service Observing System, if it 
is active on a call. 

(iii) Interfacing with the No. 4 ESS Network Management Programs. 

(iv) Call register release. 

(v) Interfacing with the No. 4 ESS Inward Wide Area 
Telecommunications Service Billing Program. 

(vi) Interfacing with the No. 4 ESS Call Detail Recording Program 
for international calls. 

The advantages of calling common call functions from the Report 
Dispenser instead of assigning that responsibility to the trunk handlers 
are as follows: 

(i) minimization of errors—the functions can be called from a 
single module; 

(ii) reduction in real time—such functions generally require the 
knowledge of both trunks, information which the Report Dispenser 
has available but which would have to be regenerated in a trunk handler; 

(iii) elimination of code—the functions can be called from a single 
module; 

(iv) simplification of changes or additions to event-dependent 
functions—a significant advantage when adding features that are 
signaling-type independent. 


2.5.3 Call failure handling 


Call failure handling or final handling is the name given to the 
cleanup process for calls that fail to complete in a normal manner. 
Calls can fail due to machine error (hardware or software), customer 
error (misdialing, early abandon), or network conditions (congestion, 
network management controls). 

There is a general class of events in the No. 4 ESS known as call 
irregularities, which cause either a retrial attempt or an abnormal 
termination of the call. An abnormal termination is called an 
ineffective attempt. Most ineffective attempts result from an inability 
to complete the setup stage of a call. Some examples are as follows: 

(i) ICT abandon in the setup stage. 

(ii) Network congestion (all circuits busy, network management 
controls). 

(iii) Failure on retrial attempts (glare, outpulsing errors, integrity 
check failures). 

(iv) Office congestion (no CRs or service circuits, network blockage, 
overload controls in effect). 

Some ineffective attempts occur in the post setup stage, such as loss 
of transmission (carrier failure). 

Final handling clears ineffective attempts, allowing call processing 
resources (CR, trunks, service circuits) to be reused for new calls. 
Announcements and tones are also provided to help inform the cus- 
tomer of the situation. 

There are numerous states that a call could be in when final handling 
is required. A call could be using many combinations of machine 
resources (e.g., CR, timing lists, service circuits). Rather than determine 
the exact state of a call and idle only those resources and processes 
associated with that state, final handling checks for and idles all 
possible resources and processes on a call. In this way, calls can be 
cleared that have invalid states or invalid resources associated with 
valid states. 
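
The "check for and idle everything" approach can be illustrated with a short Python sketch. The resource names and the `Call` class are assumptions made for the example; the only behavior taken from the text is that cleanup does not consult the call state.

```python
# Hypothetical catalog of every resource a call could possibly hold.
ALL_RESOURCES = ("call_register", "timing_list", "service_circuit", "voice_path")


class Call:
    def __init__(self, state, **held):
        self.state = state  # may be invalid; final handling never reads it
        self.resources = {name: held.get(name) for name in ALL_RESOURCES}


def final_handling(call):
    """Idle every resource that is present, regardless of call state."""
    released = []
    for name in ALL_RESOURCES:
        if call.resources[name] is not None:
            call.resources[name] = None
            released.append(name)
    return released
```

Because the routine walks the full catalog instead of branching on state, a call whose state is invalid, or whose resources do not match its state, is cleared the same way as any other, and running the routine a second time releases nothing.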

Final handling can be thought of as having two components, a call 
failure controller and a set of trunk clearing modules. The call failure 
controller holds a position in the architecture equivalent to the Report 
Dispenser, and like the Report Dispenser performs functions associ- 
ated with common call related facilities (see Fig. 6). The trunk clearing 
modules are part of each trunk handler and provide a customer 
treatment based upon the trunk signaling type. 

The call failure controller could have been made part of the Report 
Dispenser and final handling conditions treated via the same logical 
event-type interface that trunk handlers have with the Report Dis- 
penser. However, the call failure controller already existed in the pre- 
restructure architecture and changing this interface would have had a 
major impact on the existing trunk handlers. A logical event-type 
interface like that of the Report Dispenser was provided in the call 
failure control module to accommodate the CCITT No. 5 and CCITT No. 
6 trunk handlers, since they were new task programs to be developed 
during restructuring. 

When a call requires final handling, the trunk handler interfaces 
with the final control module, which clears common facilities and 
invokes the particular trunk clearing modules to idle remaining 
trunk-related facilities and to provide proper customer treatment for 
the call. 


2.6 Evaluation 

The interfaces and direction of communication between the trunk 
handlers, the task dispensers, and the Report Dispenser have become 
call processing programming standards. In some cases these standards 
produce a call flow which sacrifices real-time efficiency for the sake of 
uniformity. However, the sacrifice of real time is justified to maintain 
the integrity of the architecture. The analysis of call processing pro- 
gram errors and the changes required for program correction are a 
much simpler task because of easier problem isolation. The architec- 
ture makes the addition of new signaling types and design changes a 
more quantifiable job. The placement of new modules becomes readily 
apparent in the structure because the architecture directs the designer 
to a specific process of functional decomposition. The new signaling 
type is separated into ICT and OGT processes. Each process is then 
further decomposed into setup, post setup, and clearing functions, and 
new logical signals, if any, are identified. This process to a degree 
forces a consistent approach to the first level of task program modu- 
larity. 

Since the call processing restructure was incremental, major portions 
of the existing code were not redesigned or rewritten. Existing task 
programs were not totally reorganized into distinct setup, post setup, 
and clearing modules. However, additional reorganization continues as 
new features are added to the call processing task programs. The 
implementation of the new CCITT No. 5 and CCITT No. 6 trunk handlers 
followed the call processing program standards completely. In addition, 
CCITT No. 6 was implemented with the use of EPLX. 

As part of the 4E5 generic, the Mass Announcement System (MAS) 
feature was added to the No. 4 ESS. The MAS feature required a number 
of new types of calls to be processed by the No. 4 ESS and was a major 
software development undertaking in call processing. Using the struc- 
ture of the new architecture as the basis for MAS feature decomposition 
and design, changes were made to add MAS to the call processing 
system. The feature addition was successful. Many of the new MAS 
calls executed correctly soon after introduction for testing in the 
system laboratory environment. At the same time, the old call types 
that the No. 4 ESS previously accommodated remained intact with no 
errors introduced as a result of the MAS feature addition. 

The restructuring effort was not without problems, the principal 
one being increased real-time usage. After the architecture was 
solidified and much of the software developed, certain real-time 
critical parts were reviewed and optimized until real-time performance 
was judged to be within reason. As real-time improvements were made to 
EPLX, additional changes were made to certain areas of call processing 
to further improve real-time performance. 


III. MAINTENANCE SOFTWARE RESTRUCTURING 


Peripheral maintenance software for the No. 4 ESS also has been 
selectively restructured to minimize the cost of developing such soft- 
ware and provide the ability to continue to add new hardware features. 
The restructured peripheral fault recovery system incorporates oper- 
ating system concepts, top-down hierarchically designed control struc- 
tures, and use of a formal development methodology. This section 
gives a brief overview of the pre-restructured maintenance system. We 
also give error recovery and system recovery concepts, the motivation 
behind restructuring a selective part of the maintenance system, and 
finally a description of the restructured system, and an evaluation of 
the benefits of the new system. 


3.1 Maintenance system overview 


The stringent reliability and maintainability requirements of the 
No. 4 ESS affect both the hardware and software design of the system. 
In the software, we have developed a large program package to provide 
maintenance functions.® This maintenance software package consists 
of four functional areas that play an essential role in providing the 
maintenance capabilities of the No. 4 ESS: (i) fault recovery; (ii) 
diagnostics; (iii) system reinitialization and recovery; and (iv) system 
integrity and audits. Fault recovery is concerned with the system 
recovery from hardware faults. Diagnostics aid the craftperson in the 
identification of faults and repair of a faulty unit. System reinitializa- 
tion is concerned with the overall coordination of system recovery 
from multiple or severe hardware and software malfunctions. The 
system integrity area is concerned with the detection of and recovery 
from memory mutilation. These software areas are designed based on 
specific error recovery and system recovery concepts. We give these 
concepts in later sections for background information. 

Various types of redundancy (e.g., duplex, m + 1 duplication, n + 2 
duplication, and load sharing) are used in the different subsystems to 
meet hardware reliability requirements. Each subsystem uses a num- 
ber of error-detection techniques such as parity, matching, order 
acknowledgment, and self checking. These hardware characteristics 
place specific requirements on the maintenance software, particularly 
the fault recovery area, which is tightly coupled to hardware design. 


3.1.1 Error recovery concepts 


The maintenance software is built around several levels of execution 
based upon both maintenance software and hardware error detection 
triggers. Table I shows the maintenance program execution levels in 
the No. 4 ESS. Only those execution levels which are applicable to 
peripheral fault recovery are discussed. The remaining levels of 
execution are presented in Ref. 4 on the 1A Processor. 

1184 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1981 

Table I—Maintenance program execution levels 

Layer                            Level           Function 
-------------------------------  --------------  -------------------------- 
System recovery                  Phase 4         System initialization 
                                 Phase 3 
                                 Phase 2 
                                 Phase 1 
Maintenance hardware interrupts  A, C, D, E      Processor fault recovery 
                                 F               Peripheral recovery 
                                 G through K     Utility and timing 
Maintenance software             Interject, BLM  Segment timing validation 
                                                 Fault recovery 
                                                 Manual requests 
Audits, diagnostics              Base            Low-priority task 

Base level is the lowest and the normal level of system execution. 
All the call processing work described earlier, as well as audits and 
diagnostics, are normally executed at this level. Base level maintenance 
(BLM) is the next level and is triggered by defensive checks provided 
in software or firmware. Interject level is the next higher level and is 
guaranteed to be served by the 1A Processor every 10 ms. F-level 
interrupts report peripheral errors and are of two types: peripheral 
unit failure (PUF) and autonomous peripheral unit failure (APUF). The 
PUF interrupt is generated by the 1A Processor when a peripheral 
frame fails to acknowledge, or incorrectly acknowledges, a directed order. 
The APUF interrupt is generated autonomously by a peripheral unit 
failure. The 1A Processor scans for APUFs every 11.2 μs. Base level 
maintenance, interjects, and both types of F-level interrupts invoke 
peripheral fault recovery. Fault recovery actions can also be stimulated 
by manual requests, such as input messages or power control switch 
requests. 


3.1.2 System recovery concepts 


When fault recovery succeeds in reconfiguring the system so the 
faulty unit is not in service, repair activity commences. However, in 
cases when complex or multiple faults prevent fault recovery from 
configuring an acceptable working system, system recovery actions are 
taken. Phase recovery is the highest level system recovery action and 
can be initiated either manually or by software. Phase recovery can 
escalate through four phases, where phase 1 is the least severe and 
phase 4 is the most severe. Phase 4 can only be requested manually. 

There are two types of Phase 1s. The first type executes a specified 
set of audits that correct data mutilation. The second type is a directed 
phase 1 and is initiated by a fault recovery action on a peripheral unit 
which results in a loss of service provided by the unit. The directed 
phase 1 initializes software structures associated with the faulty unit. 
A phase 2 initializes additional software structures and, when it is 
initiated by an F-level interrupt, also performs a unit access test on 
the peripheral hardware. Phase 3 is the highest level phase that is 
automatically requested. It performs additional software structure 
initialization and 
additional tests on the hardware. A phase 4 performs a total system 
initialization and can only be requested manually. 
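
The escalation rule can be stated compactly in Python. This is a minimal sketch: the function names are hypothetical, and stepping up one phase at a time is an assumption made for the example. The two limits (phase 3 as the highest automatic phase, phase 4 as manual only) come from the text.

```python
HIGHEST_AUTOMATIC_PHASE = 3  # highest phase software may request on its own
HIGHEST_PHASE = 4            # total system initialization, manual only


def request_allowed(phase, manual):
    """Return True if this phase may be requested by this kind of requester."""
    if not 1 <= phase <= HIGHEST_PHASE:
        return False
    return manual or phase <= HIGHEST_AUTOMATIC_PHASE


def escalate(current_phase, manual=False):
    """Return the next phase to try after current_phase, or None if none is allowed."""
    next_phase = current_phase + 1
    return next_phase if request_allowed(next_phase, manual) else None
```

With these definitions, automatic escalation stops after phase 3, and only a manual request can reach phase 4.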


3.2 Motivations for a modern structure 

3.2.1 Drawbacks of original implementation 

Since the initial generic release (termed 4E0), each new generic has 
included new features, hardware cost reductions, and enhancements. 
Each generic must continue to meet the original design objectives of 
the system for capacity and reliability and, at the same time, provide 
new services, and take advantage of the rapidly changing technology 
through hardware cost reductions. By the end of the second generic 
(4E1), the continuing demand for new hardware features and cost- 
reduced hardware was evident. We, therefore, determined how the 
development cost could be reduced for maintenance software. Of the 
four maintenance software areas described earlier, we found the fault 
recovery area to be affected most by new hardware feature develop- 
ment since efficient changes or additions could not be made. 

The principal reason the pre-4E2 fault recovery software exhibited 
this lack of flexibility was that it was functionally partitioned with a 
decentralized control structure. Figure 8 shows the functional parti- 
tions used in the pre-4E2 fault recovery software. They include: 

(i) Peripheral Configuration Program, 

(ii) Craft-Machine Interface Program, 

(iii) Hardware Phase Recovery, 

(iv) Error Analysis Program, 

(v) Per unit fault recovery programs. 

Each functional area contained control and unit-dependent code for 
many units embedded in the same programs. Each time a new unit 
was added, the functional area was modified, resulting in increased 
complexity, and requiring the entire area to be retested. Each one had 
decentralized control and had to provide for the following common 
requirements: 

(i) Multilevel execution, 

(ii) Unit dependent interfaces, 

(iii) Hardware and software coordination, 

(iv) Reentrant software. 

Fig. 8—Peripheral fault recovery structure (4E2 generic). 

Fault recovery software is required to execute on all system levels of 
execution, i.e., base level, BLM, interject, F-level interrupt, and phase 
level. In general, other maintenance software executes only on one 
level. In addition, fault recovery software is required to interface to all 
types of peripheral hardware. Hardware coordination is required be- 
cause of the highly interconnected hardware. In particular, recovery 
from a problem in one unit generally affects other units. Software 
coordination is required to prevent software interaction due to 
time-shared execution on base level. Reentrant software is required due 
to the multilevel execution of fault recovery software. As an example, the 
same fault recovery software may be started and successively re- 
started by escalating recovery actions. These requirements resulted in 
excessive interfaces and interaction between the different functional 
areas. Consequently, much fault recovery code was duplicated in an 
effort to reduce the number of interfaces. However, duplication made 
the job of maintaining the software much more difficult. 

Each unit fault recovery program was responsible for many functions 
common to fault recovery of several units. The functions were being 
performed by several different programs in numerous ways. For ex- 
ample, each unit fault recovery program provided interfaces to each 
functional area, provided software to collect common recovery data, 
provided software to output recovery messages, etc. 

Since most fault recovery software executes on interrupt level, it was 
designed with emphasis on real-time efficiency to minimize the 
interruption of base level due to a faulty unit. Techniques such as 
“tricky code” and private interfaces were used for real-time 
efficiency. This also contributed to a structure that was difficult to 
change. 


3.2.2 Development of improved structure 

In response to these shortcomings in the pre-restructured fault 
recovery structure, an improved fault recovery structure was developed 
to incorporate the following: (i) a peripheral maintenance operating 
system; (ii) new hierarchically designed fault recovery control 
structures; (iii) a higher-level language; and (iv) a more formal development 
methodology. 

The operating system would remove some complexity from the 
software by handling multilevel execution, memory allocation, and 
software coordination, and provide a truly standard interface between 
functional areas of fault recovery software. 

The hierarchically designed control structure would provide com- 
plete separation between control and unit-dependent code. This would 
remove much of the unnecessary complexity in the control areas and 
limit the testing mainly to the new feature software being added. A 
hierarchical structure would lend itself more easily to changes and 
additions. It would improve readability and maintainability of the 
product. 

A high-level language would improve programming productivity, 
readability, and maintainability. Programming productivity would be 
improved by allowing the programmer to concentrate on programming 
the function and not on initializing and saving registers, implementing 
loops, etc. Removing this level of detail from the source code would 
also improve the readability and maintainability of a program. 

A formal development methodology would provide uniform and 
up-to-date documentation. The more rigorous steps in a methodology 
that insist on requirement reviews, design reviews, code walk-throughs, 
and test plan reviews help ensure that more software bugs are found 
early in the development. Other benefits of this formal development 
methodology, which uses development teams, are better project visi- 
bility and a larger group of people with knowledge of the software. 

The development cost of the operating system and new control 
structures could be spread over several generics with little additional 
development cost beyond that required to add new units. Once the 
operating system and control structure were in place, the development 
costs for a new hardware-related feature would be reduced. In addition, 
program maintenance cost would be reduced. 

Each of the above techniques, to some degree, has the drawback of 
less real-time efficiency and greater program size. The advantages 
stated above were judged to outweigh these considerations. It is also 
common practice when using a structured design approach to optimize 
after the design is working. Time should be scheduled for optimization 
when it can be determined which areas require real-time and program- 
size optimization. Note that optimization is generally easier in a 
structured design that is written in a high-level language. In many 
cases, large improvements in real-time and program store usage can 
be accomplished by small changes in a structure and/or compiler. 
Also, the increased program size is partially offset by reduced 
temporary memory requirements. This reduction can be attributed to more 
efficient use of temporary memory by the new operating system. 

The fault recovery software, thus, evolved to a set of centralized 
control structures executing under a maintenance operating system. 
These control structures are designed with complete separation be- 
tween control and unit-dependent code. Both maintenance control and 
unit-dependent software are written in EPLX. To add units to this new 
system, unit-dependent modules are added to each control structure 
as illustrated in Fig. 9 by the dashed blocks. In general, no modification 
is needed to the control structures. As a result, the new unit software 
must be tested but little regression testing is required. In the 
pre-restructured fault 
recovery system each functional area required modifications to add 
the new control code and unit code. In Fig. 8 the blocks with the 
blacked-in corners are the functional areas that required extensive 
changes and additions. Each functional area required testing of the 
new unit fault recovery software and extensive regression testing of 
existing unit fault recovery software. 
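
The separation of control from unit-dependent code can be pictured with a small Python sketch. It is not EPLX and does not reproduce any actual control structure: the class `FaultRecoveryControl`, the registration interface, and the unit names `frame_a` and `frame_b` are hypothetical.

```python
from typing import Callable, Dict


class FaultRecoveryControl:
    """Control structure: knows the procedure, not the hardware details."""

    def __init__(self) -> None:
        self._unit_modules: Dict[str, Callable[[int], str]] = {}

    def register(self, unit_type: str, recover: Callable[[int], str]) -> None:
        """Plug in a unit-dependent module behind the standard interface."""
        self._unit_modules[unit_type] = recover

    def recover(self, unit_type: str, unit_number: int) -> str:
        """Common control flow; only the final call is unit dependent."""
        module = self._unit_modules.get(unit_type)
        if module is None:
            raise KeyError(f"no unit-dependent module for {unit_type}")
        return module(unit_number)


def recover_frame_a(unit_number: int) -> str:
    """Hypothetical unit-dependent module for an existing unit type."""
    return f"frame_a {unit_number}: switched to mate"


def recover_frame_b(unit_number: int) -> str:
    """Hypothetical module for a newly added unit type."""
    return f"frame_b {unit_number}: removed from service"


control = FaultRecoveryControl()
control.register("frame_a", recover_frame_a)
# Adding a new unit touches no control code: only a new module is registered.
control.register("frame_b", recover_frame_b)
```

In this arrangement the control class is untouched when `frame_b` is added, which is why testing can concentrate on the new module.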

The new fault recovery system was planned to be evolved over 
several generics and to operate in parallel with the pre-restructure 
fault recovery system. The pre-restructure fault recovery system 
continues to handle the units it was designed to accommodate. New units 
are being implemented under the new fault recovery system in the 
EPLX language. 

Fig. 9—Peripheral fault recovery structure (4E5 generic). 


3.3 Characterization of new fault recovery control architecture 


The development of the new fault recovery control architecture was 
a multigeneric development. This new architecture was developed as 
a parallel system without disturbing the existing fault recovery 
software system, which supports the existing peripheral hardware 
architecture. The initial introduction was in the 4E3 generic with the 
development of the Peripheral Maintenance Operating System (PMOS) 
and the Peripheral Unit Fault Recovery (PUFR) control structure. The 
first units supported by this system were the network frames (Time 
Slot Interchange and Time Multiplexed Switch). In the 4E5 generic, 
five new control structures were added, plus unit-dependent code for 
four new hardware frames. The five control structures were: (i) Toll 
Peripheral Configuration (TOPIC); (ii) Frame Request and Diagnostic 
Interface (FRDI); (iii) Failure Error Analysis (FERA); (iv) Message 
Dispenser and Coordinator (MEDIC); and (v) Bootstrap Control 
(BOOTCNTL) Program. 

Each of the control structures, with the exception of MEDIC, was 
recommended as part of the original plan. Message Dispenser and 
Coordinator is a control structure which resulted from the introduction 
of intelligent (microprocessor-based) peripherals into the No. 4 Ess. 
These new peripherals execute macro-level orders which return mul- 
tiword responses on a deferred basis after control is released. This 
required a new structure to control sending, receiving, and dispensing 
responses from these units. All four of the new hardware frames 
developed for 4E5 were of this type. 

The resulting fault recovery software structure after the 4E5 generic 
is shown in Fig. 9. Each of the control structures was designed to meet 
the following objectives: 

(i) Remove complex system dependencies by making use of the 
PMOS. 

(ii) Make use of a more formal development methodology. 

(iii) Use EPLX. 

(iv) Provide complete separation between control and unit-dependent 
code. 

(v) Provide standard unit-dependent interfaces. 

(vi) Remove limitations (structure sizes, number of units, etc.) 
which exist in the present fault recovery software system. 

(vii) Provide new capabilities. 

Sections 3.3.1 through 3.3.7 briefly describe each functional area of the 
new control architecture. 


3.3.1 Peripheral maintenance operating system 

The PMOS is the heart of the new fault recovery control architecture. 
This operating system centralizes peripheral maintenance control 
and coordination while reducing the complexity of system interaction. 
The operating system provides a standardized interface between PMOS 
tasks and the remainder of the system software. A PMOS task is a 
software process or function defined to the operating system. This 
interface allows an operating system task to be requested with several 
options specifying levels of execution, request mode, and priority. For 
example, simultaneous tasks can be scheduled on the same level or 
different levels, requested in a schedule and hold mode, parallel 
schedule, or run-immediate mode. The operating system provides for task 
execution on base level, BLM, interject level, F-level, and phase level. 
Schedule and hold mode allows a task to schedule another task and be 
suspended until that task is completed. Parallel schedule allows 
a task to schedule several other tasks and be suspended until all are 
completed. Run-immediate mode allows a task to request other tasks 
to be executed immediately. The main features of the operating system 
are: (i) task coordination; (ii) multilevel execution; (iii) administration 
of segment breaks; and (iv) reentry. 
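
The schedule and hold and parallel schedule request modes can be illustrated with a small cooperative scheduler. The Python fragment below is only a sketch under stated assumptions: the function names (`run`, `leaf`, `recovery`), the request tuples, and the use of generators as tasks are hypothetical and do not come from the PMOS design. Execution levels, priorities, and run-immediate mode are not modeled.

```python
from collections import deque


def run(root):
    """Run a root task to completion with a toy cooperative scheduler.

    A task is a generator. It may yield one of two requests:
      ("hold", task)         suspend until that one task completes
      ("parallel", [tasks])  suspend until every listed task completes
    """
    ready = deque([root])
    parent_of = {}      # child task -> task suspended on it
    outstanding = {}    # suspended task -> number of unfinished children

    while ready:
        task = ready.popleft()
        try:
            mode, payload = next(task)
        except StopIteration:
            parent = parent_of.pop(task, None)
            if parent is not None:
                outstanding[parent] -= 1
                if outstanding[parent] == 0:
                    ready.append(parent)    # reactivate the suspended requester
            continue
        children = [payload] if mode == "hold" else list(payload)
        if not children:
            ready.append(task)              # nothing to wait for
            continue
        outstanding[task] = len(children)
        for child in children:
            parent_of[child] = task
            ready.append(child)


def leaf(name, log):
    """A task that does one piece of work and completes."""
    log.append(name)
    yield from ()       # issues no requests; makes this function a generator


def recovery(log):
    """Hypothetical requester that uses both request modes."""
    log.append("recovery: start")
    yield ("hold", leaf("isolate fault", log))
    log.append("recovery: resumed after hold")
    yield ("parallel", [leaf("restore frame A", log),
                        leaf("restore frame B", log)])
    log.append("recovery: resumed after parallel")
```

Calling `run(recovery(log))` with an empty list `log` records the requester's start, the held task, the requester's reactivation, both parallel tasks, and the final reactivation, in that order.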

Peripheral fault recovery code for several units tends to be tightly 
coupled due to highly interconnected hardware units. The operating 
system removes from a task many of the concerns of task interaction 
by providing several task coordination functions. The operating system 
exercises blocking rules as defined in a central blocking table. Blocking 
prevents time-shared execution of specific tasks which otherwise would 
interact. Control of abort conditions and the execution of abort pro- 
cedures are also provided. Control of execution and the determination 
of associated priorities are also included in a task coordination function. 
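
The blocking rule can be pictured as a table lookup made before a task is allowed to share execution time with tasks already in progress. The following Python fragment is a minimal sketch, not the PMOS implementation: the task names, the pairwise form of the table, and the function `may_start` are assumptions made for illustration.

```python
# Hypothetical central blocking table: each entry names a pair of tasks
# that must not be time-shared because they would otherwise interact.
BLOCKING_TABLE = {
    frozenset({"TSI_RECOVERY", "TMS_RECOVERY"}),
    frozenset({"NCLK_CONFIG", "TSI_RECOVERY"}),
}


def may_start(candidate, active_tasks):
    """Return True if no active task is blocked against the candidate."""
    return all(
        frozenset({candidate, active}) not in BLOCKING_TABLE
        for active in active_tasks
    )
```

Keeping the rules in one table means that a task does not need to know which other tasks it could disturb; only the table changes when a new unit is added.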

Multilevel execution is a characteristic which in the past required 
numerous redundancies in many fault recovery programs. For example, 
each fault recovery program was required to check for the execution 
level and perform the necessary function to segment on that level. The 
operating system consolidates the necessary checks and functions to 
execute on different levels in one place. In general, tasks need not 
know what execution level they are on. 

Segment breaks* required by base level processing add complexity 
and substantial development cost without a unified control architec- 
ture. Peripheral Maintenance Operating System provides segmenta- 
tion routines for the new control structures. These routines preserve 
task memory when segment breaks are taken and ignore segment 
breaks on interrupt, interject, and phase level. The operating system 
also provides routines for timing breaks. A timing break at any execu- 
tion level releases the operating system for execution of other tasks 
until the time specified at the break has expired. The task environment 
is preserved on segment breaks or timing breaks and reestablished 
upon return to the task. 

Reentry is a condition that causes numerous problems for multilevel 
maintenance software. This problem arises, for example, when a mul- 
tilevel program is interrupted on base level and the same program is 
entered on the interrupt. This can result in variables, initialized on 
base level, being overwritten on the interrupt level. This problem has 
forced fault recovery software to be exceedingly defensive during 
execution, adding system integrity checks and numerous other controls 
to avoid problems. The operating system resolves each case of reentry 
to the interrupted program. The integrity of each task is maintained 
either by aborting a task or allocating a different memory block to the 
task. 

* Segment break is a convention in No. 4 Ess whereby all base level processing 
programs are required to return control to the Executive Control program every 3 ms. 


3.3.2 Message dispenser and coordinator 


The MEDIC is a control structure developed to satisfy new require- 
ments introduced with microprocessor-based frames in the peripheral 
system of the No. 4 Ess. Prior to 4E5, all peripheral frames on the 
peripheral unit bus returned responses to orders in the peripheral unit 
bus reply window. This window is 32 1A Processor cycles or 22.4 µs in 
duration. With the introduction of microprocessor-based frames, their 
macro-level orders required much longer times to complete because of 
the higher-level function being performed. These frames were designed 
to return an initial response in the reply window, indicating the order 
was accepted. A “task complete” response was returned when the 
macro work was completed within the frame. 

Message Dispenser and Coordinator was developed as a control 
structure, having special interaction with the operating system. The 
basic functions of MEDIC are to (i) coordinate sending macro orders to 
microprocessor-based frames; (ii) poll these frames for responses on a 
deferred basis; and (iii) dispense those results to the appropriate client. 
The message dispenser, in conjunction with the operating system, 
provides primitives (low-level function calls) which allow a PMOS task 
to be suspended while waiting for a macro response. The task is 
automatically reactivated when the macro response is received. MEDIC 
provides a macro timeout notification. If a response is not received in 
a predefined maximum allowed time, the task is notified. The message 
dispenser also provides appropriate handling of unsolicited frame 
reports and autonomously generated reports. The unsolicited and 
autonomously generated reports are processed on BLM. The fault 
recovery program resolves the cause of the report and takes the 
appropriate recovery action to clear the problem. 
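
The deferred-response handling described above can be sketched as a small bookkeeping structure. The Python fragment below is an illustration under stated assumptions, not the MEDIC design: the class `MessageDispenser`, its method names, the tick-based clock, and the `max_wait` parameter are all hypothetical. Unsolicited and autonomously generated reports are not modeled.

```python
class MessageDispenser:
    """Toy model: tracks outstanding macro orders and notifies waiting tasks."""

    def __init__(self, max_wait):
        self.max_wait = max_wait    # hypothetical timeout, in arbitrary ticks
        self.pending = {}           # frame -> (task name, time order was sent)
        self.notifications = []     # (task, frame, outcome) in delivery order

    def send(self, task, frame, now):
        """Record a macro order; the requesting task is considered suspended."""
        self.pending[frame] = (task, now)

    def poll(self, completed_frames, now):
        """Deferred poll: dispense completions, then flag overdue orders."""
        for frame in list(self.pending):
            task, sent_at = self.pending[frame]
            if frame in completed_frames:
                outcome = "task complete"
            elif now - sent_at > self.max_wait:
                outcome = "timeout"
            else:
                continue            # still waiting; the task stays suspended
            del self.pending[frame]
            self.notifications.append((task, frame, outcome))
```

In this sketch a task is reactivated by an entry in `notifications`, either because its frame reported completion or because the allowed time expired.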


3.3.3 Peripheral unit fault recovery 


The PUFR control structure was the first control structure developed. 
It was developed in the 4E3 generic and supported the cost-reduced 
TSI frames. The Peripheral Unit Fault Recovery is a common control 
structure that handles all levels of peripheral error recovery (BLM, 
interject, and F-level). It consolidates common error-recovery func- 
tions under one control program by providing the following common 
functions: (i) initialization of data structures; (ii) collection of critical 
data required to isolate the source of a fault; (iii) an interface to unit- 
dependent tasks to isolate the fault; (iv) an interface to error analysis 
programs to acknowledge the recovery actions; (v) execution of the 
necessary actions to recover the system; (vi) scheduling of any deferred 
maintenance actions, e.g., diagnostics, audits, etc.; and (vii) printing of 
reports containing critical data and the recovery action taken at the 
time of the fault. 

The Peripheral Unit Fault Recovery controls the execution of fault 
recovery by calling both common routines and special unit-dependent 
procedures. It satisfies the requirements of complete separation of 
control and unit-dependent software by providing standard interfaces 
to unit-dependent procedures. It calls the appropriate unit procedure 
by indexing a table based on unit identity. In addition to consolidating 
the common functions, PUFR also provides several enhancements. 
Some of these enhancements are: (i) multiple isolation attempts; (ii) 
multiple unit interface to error analysis; and (iii) enhanced report 
messages. Multiple isolation attempts allow a unit isolate program to 
request an isolation attempt on a different unit, for the same interrupt, 
when the problem cannot be resolved to the original unit. The subse- 
quent isolation attempt is usually on a connecting frame. Multiple unit 
interface to error analysis allows the unit isolation program to pass a 
list of suspect units to error analysis when the problem cannot be 
resolved to a single unit. Enhanced report messages provide the craft 
with additional information concerning the source of the interrupt and 
the corrective action taken. 
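
Table-driven selection of the unit procedure, together with multiple isolation attempts, might look like the following Python sketch. It is an illustration only: the table `UNIT_ISOLATE_TABLE`, the routine names, the result fields, and the `max_attempts` limit are assumptions and are not taken from the PUFR software.

```python
def isolate_tsi(data):
    """Hypothetical unit-dependent routine: cannot pin the fault on the TSI,
    so it asks for another attempt on the connecting TMS frame."""
    return {"resolved": False, "try_next": "TMS"}


def isolate_tms(data):
    """Hypothetical unit-dependent routine that resolves the fault."""
    return {"resolved": True, "suspect": "TMS half 0"}


# Table indexed by unit identity. Supporting a new unit type adds a row
# here; the control code below is unchanged.
UNIT_ISOLATE_TABLE = {"TSI": isolate_tsi, "TMS": isolate_tms}


def pufr_isolate(unit_type, data, max_attempts=3):
    """Common control: call unit procedures until one resolves the fault."""
    attempts = []
    while unit_type is not None and len(attempts) < max_attempts:
        result = UNIT_ISOLATE_TABLE[unit_type](data)
        attempts.append((unit_type, result["resolved"]))
        if result["resolved"]:
            break
        unit_type = result.get("try_next")   # attempt on a different unit
    return attempts
```

The control routine never names a unit type directly, which is the separation between control and unit-dependent code that the section describes.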


3.3.4 Failure error analysis 


The FERA program provides the centralized control structure for 
carrying out the fault recovery error analysis function. The main roles 
of error analysis in the No. 4 Ess are listed below: 

(i) Complement fault recovery by adding the element of interrupt 
history, 

(ii) Resolve intermittent and transient hardware faults, 

(iii) Resolve faults in interconnected hardware, 

(iv) Isolate persistent or intermittent system troubles in highly 
interconnected hardware subsystems, 

(v) Record and analyze error history information, 

(vi) Provide graceful degradation by removing the units that cause 
the minimum service effect when correcting a system problem, 

(vii) Monitor deferred maintenance activities to guard against re- 
moving the redundant part of an intermittently failing piece of equip- 
ment. 

Failure Error Analysis provides these functions by determining 
recovery actions with strategy tables. Strategy tables are a collection 
of decision schemes which make different decisions on successive 
occurrences of an error. A strategy table is selected by fault recovery 
based on the type of fault occurring. The Analysis can acknowledge 
and accept the action recommended by PUFR or recommend an alter- 
nate action. The decision schemes and the selection of a strategy table 
use several factors to reach a decision: 

(i) environment of the configurable portion of the system (simplex, 
duplex), 

(ii) number of times the fault has occurred, 

(iii) type of fault (unique, nonunique), 

(iv) characteristic of the fault (transient, hard failure, illegal system 
action). 

The Analysis also provides control for alternate recovery strategies. 
This control allows better analysis functions to be performed. Another 
new feature is a parallel analysis capability, which allows a 
strategy table and analysis function to execute in parallel. If the 
analysis function reaches a conclusion, it can override the action 
recommended by the strategy table. These new strategies allow FERA 
to resolve intermittent faults, transient faults in interconnected hard- 
ware, and persistent troubles more efficiently than the existing error 
analysis program. These enhancements were provided in addition to 
meeting the common objectives of all the new control structures. 
Secondary functions provided by FERA are (i) monitoring manual 
configuration requests; (ii) monitoring deferred maintenance actions 
(diagnostics, routine exercises); and (iii) manual input/output for 
control and display of FERA functions and data. 


3.3.5 Craft-machine control program 


The craft-machine interface functions are provided by the FRDI 
control structure. It provides the basic interface for manual configu- 
ration requests of the No. 4 Ess peripherals from either the TTY or 
Power Control Switch (PCS) located on the frame. Requests from the 
TTY may be for removal, restoral with diagnostic, restoral without 
diagnostics (unconditional), or for a switch of an active unit. Requests 
from the PCS may be for removal of a unit or restoral with a diagnostic. 
When any manual configuration requests are initiated, FRDI validates 
the request, interfaces with the TOPIC program to perform the config- 
uration, interfaces with the FERA program to monitor the request, and 
prints the appropriate message to indicate whether the action was 
completed or denied. In addition to printing a message, if the request 
was initiated from a PCS, lights at the frame are lighted or extinguished 
to acknowledge the request. 

The Frame Request and Diagnostic Interface is also the primary 
interface to the Diagnostic Control program for peripheral configura- 
tion before and after diagnostic requests. All diagnostic requests, 
independent of the source, are validated by FRDI. After the request is 
validated, the appropriate configuration function is requested. After 
the diagnostic, FRDI controls the disposition of the unit by either 
restoring it or leaving it out of service. This decision depends upon a 
variety of conditions, such as the termination condition of the diag- 
nostic, the results of the diagnostic, the type of request, and the state 
of the PCS. 


3.3.6 Configuration control 


The configuration control in the new peripheral fault recovery 
control architecture is provided by the TOPIC program. The Toll 
Peripheral Configuration program is responsible for establishing the 
configuration of the new peripherals introduced in the 4E5 generic. It 
is also responsible for the configuration of the Network Clock (NCLK) 
and System Clock (SYSCLK). These latter units existed in the initial 
release of the No. 4 Ess generic, 4E0. However, with the addition of 
the Network Clock Synchronization Unit (NCSU) in 4E5, a major 
portion of the configuration software had to be modified and was 
moved into the new architecture. 

Beyond the primary function of accepting requests from all sources 
and directing configuration requests to the specific unit-dependent 
program, TOPIC will determine if there are any connecting unit consid- 
erations, for example, clock or voice data path dependencies. If there 
are, TOPIC will take appropriate action on the connecting units. It also 
attempts to leave the resultant peripheral system configuration in a 
state that minimizes service-degrading conditions. This is also based 
on connecting unit status. 


3.3.7 Bootstrap control 


Bootstrap is a function executed during phases of recovery which 
include hardware configuration. The bootstrap function for the new 
units in the 4E5 generic is controlled by the BOOTCNTL program. The 
function of bootstrap is to initialize the hardware and execute access 
tests. The degree of initialization and access testing varies depending 
on the phase of recovery (phases 1 to 4). With the introduction of 
microprocessor-based units in 4E5, a large portion of the initialization 
of these frames is performed by firmware resident within the frame. 
During this initialization process, the 1A Processor is free to perform 
other functions. Bootstrap Control, making use of features provided 
by the operating system, is free to start the bootstrapping of other 
units. This technique is referred to as parallel bootstrap. This process 
results in less total time required to bootstrap several frames than 
would be required if the function were executed in a serial fashion. 
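
The saving from parallel bootstrap can be estimated with a simple timing model. The Python sketch below rests on assumptions that are not in the text: each frame is described by a hypothetical pair of times (processor time to issue the start order, firmware initialization time in the frame), the processor issues start orders one after another, and firmware in different frames runs concurrently. The numbers in the usage note are invented.

```python
def serial_bootstrap_time(frames):
    """Processor waits for each frame's firmware before starting the next."""
    return sum(start + firmware for start, firmware in frames)


def parallel_bootstrap_time(frames):
    """Processor issues start orders back to back; firmware runs concurrently."""
    clock = 0       # time at which the processor finishes issuing orders so far
    finish = 0      # latest completion time seen over all frames
    for start, firmware in frames:
        clock += start                          # processor busy issuing order
        finish = max(finish, clock + firmware)  # frame completes on its own
    return finish
```

With three frames given as `[(1, 10), (1, 8), (1, 12)]` in arbitrary time units, the serial model gives 33 units and the parallel model gives 15, because the total is dominated by the slowest frame rather than by the sum of all frames.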

Bootstrap Control also provides output containing the results of 
access tests performed during the phase. This information is useful to 
the craft in understanding the final configuration after a phase, and in 
troubleshooting the peripherals removed from service during a phase. 


3.4 A fault recovery example 


This section presents a simple example of recovery from a hardware 
fault by the restructured fault recovery system. This example is 
provided to give a clear understanding of the function of each control 
structure. Figure 10 illustrates the actions taken in this example. A 
single hard (nonintermittent) fault in a duplicated unit is assumed for 
this example. The FRDI and BOOTCNTL structures are not involved in 
this example since they primarily execute on base and phase level, 
respectively. 

A hardware error triggers an F-level interrupt and results in a PUFR 
F-level task being scheduled in PMOS. Peripheral Unit Fault Recovery 
performs initialization of internal data structures and collects data at 
the time of the interrupt which reflects the state of the system and the 
interrupting unit. It then schedules (schedule and hold mode) the 
unit’s fault recovery task, passing the data it has gathered as input. 
Peripheral Unit Fault Recovery is suspended until the unit fault 
recovery task completes. 

The unit fault recovery task will attempt to isolate the source of the 
interrupt to a configurable piece of hardware (half of a duplicated unit, 
etc.). The unit fault recovery task will analyze the input data, perform 
access tests, and reconfigure the unit and retry peripheral orders to 
isolate the source of the error. The suspect unit half, fault class (hard 
fault, software fault, intermittent fault, etc.), resolution class (resolved, 
unresolved), and recommended actions are returned to PUFR in a data 
block passed as input. Peripheral Unit Fault Recovery is reactivated 
when the unit fault recovery task is completed. 

It then schedules (schedule and hold mode) the FERA task passing 
the suspect unit, fault class, and resolution class as inputs. The Analysis 
then determines if this is the first interrupt for this unit by consulting 
history data files. A new history data file is allocated, if it is the first 
interrupt. The Analysis updates a history data file, if a previous 
interrupt has been recorded for the suspect unit. It then selects a 
primary strategy based on interrupting unit, number of interrupts, 
fault class, resolution class, and configuration of the unit at the time of 
the error (simplex or duplex). The primary strategy acknowledges the 
action recommended by the unit fault recovery task or specifies an 
alternate action. The primary strategy is not used if the number of 
interrupts that have occurred on this unit is greater than the designed 
limits of a strategy table. The Analysis control selects a secondary 
strategy if this occurs. Otherwise, the secondary strategy is not used if 
a recovery action is specified by the primary strategy. The Analysis 
control always executes the analysis strategy independent of the action 
specified in the primary or secondary strategy. The analysis strategy 
views the interrupt in terms of a potential multi-unit hardware inter- 
connection fault. The analysis strategy can override the action specified 
by the primary or secondary strategy if an interconnection fault is 
suspected. The Analysis returns the recommended recovery action to 
PUFR in a data block passed as input. 

Fig. 10—Example fault recovery actions. 
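
The order in which the primary, secondary, and analysis strategies are consulted can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the representation of a strategy table as a list indexed by interrupt count, the `ACK` marker, the function name, and the action strings are hypothetical and do not reproduce the actual FERA tables.

```python
ACK = "acknowledge"   # marker: accept the action recommended by the unit task


def choose_recovery_action(recommended, interrupt_count, primary, secondary,
                           analysis_override=None):
    """Pick a recovery action for the given interrupt (count starts at 1).

    primary           list of entries, one per successive occurrence; an
                      entry is ACK, an alternate action, or None (no action)
    secondary         action used when the primary strategy gives none
    analysis_override action from the analysis strategy when it suspects
                      an interconnection fault, else None
    """
    action = None
    if interrupt_count <= len(primary):     # within the table's designed limit
        entry = primary[interrupt_count - 1]
        action = recommended if entry == ACK else entry
    if action is None:                      # primary not usable or silent
        action = secondary
    if analysis_override is not None:       # analysis strategy has the last word
        action = analysis_override
    return action
```

A table such as `[ACK, ACK, "remove unit and diagnose"]` accepts the recommended action on the first two interrupts, substitutes an alternate action on the third, and leaves later interrupts to the secondary strategy.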

The Peripheral Unit Fault Recovery is again reactivated when FERA 
completes. The recovery actions specified by FERA are then performed. 
The recovery actions may be an immediate action (configuration, etc.) 
or a deferred action (diagnostic, audits, etc.). Deferred actions are 
scheduled for base level execution. Immediate actions are scheduled 
for F-level execution. In the case of an immediate configuration action, 
PUFR schedules a TOPIC task in a schedule and hold mode. Toll 
Peripheral Configuration validates the configuration request and in- 
terfaces to the specified unit software for the function requested 
(remove, restore, switch, etc.). PUFR is reactivated after the TOPIC task 
is completed. 

At this point the recovery is completed. The remaining function of 
the Peripheral Unit Fault Recovery is to format and output the data 
collected at the time of the interrupt and the recovery actions taken. 
It then returns to PMOS, completing the F-level processing. 

The Peripheral Maintenance Operating System returns to base 
level processing after it determines that no other F-level tasks are 
scheduled. The deferred actions scheduled for base level are then 
executed. 


3.5 Evaluation 


With the release of the 4E5 generic, the multigeneric plan to develop 
a centralized peripheral fault-recovery-control architecture and a 
maintenance operating system is complete. This new control structure 
provides the fault recovery capability for four new peripheral unit 
types (DIF, PUC, MAS, NCSU). Four other peripheral unit types (TSI, 
TMS, NCLK, SYSCLK) are partially supported under this system. 

The development cost of the new fault recovery control architecture 
and the unit-dependent code for the units listed above has been slightly 
larger than the original estimates. This difference can be partially 
attributed to the introduction of microprocessor technology with these 
units. It can also be said that introducing this new technology in the 
pre-restructured maintenance system would have resulted in even 
larger software development costs for these units. 

The new fault-recovery-control architecture provides a well-docu- 
mented, flexible-control architecture with well-defined interfaces to 
unit-dependent code. With this control architecture in place, new 
features can be developed with an estimated 30 to 50 percent savings 
in fault recovery software effort. A portion of this savings can be 
attributed to the development methodology and the use of a high-level 
language. It is difficult to estimate the savings contributed by either 
factor. Also, it is not clear that the total benefit of either factor can be 
attained without the other also being present. 

There are also many side benefits in addition to a decrease in 
development costs. Most of these benefits stem from the use of a 
modern development methodology. Improved, up-to-date documen- 
tation is one benefit already mentioned. Others are (i) better project 
visibility through the use of development teams and walkthroughs; (ii) 
a larger base of people with knowledge of specific software modules 
through the use of development teams; and (iii) more software bugs 
found early in the development prior to laboratory testing and field 
release. 

Disadvantages of the new fault recovery control architecture and 
structure design methodology are increased program size and real-time 
usage. Real-time usage in error recovery, even though critical, does 
not significantly affect the system call handling capability as it does in call 
processing programs. These disadvantages were anticipated, but it was 
unclear what increase could be expected. Initial data indicate that a 
specific function like error recovery, which is real-time critical, has the 
following distribution of real-time usage: 

Operating system—10 percent, 

Control structures—2 percent, 

Macro waits—35 percent, 

Unit code—53 percent. 
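
The distribution can be checked and regrouped with a few lines of arithmetic. The Python fragment below uses only the four percentages listed above; the variable names and the grouping are illustrative, and the reading of macro waits as time in which no maintenance software is executing is an assumption.

```python
# Initial real-time distribution for error recovery, in percent (from the text).
REAL_TIME_SHARE = {
    "operating system": 10,
    "control structures": 2,
    "macro waits": 35,
    "unit code": 53,
}

total = sum(REAL_TIME_SHARE.values())

# Share spent executing maintenance software, assuming macro waits are
# periods spent waiting for frames to complete macro orders.
executing = total - REAL_TIME_SHARE["macro waits"]

# Share attributable to the common architecture rather than to unit code.
overhead = (REAL_TIME_SHARE["operating system"]
            + REAL_TIME_SHARE["control structures"])

print(total, executing, overhead)   # 100 65 12
```

Under that assumption, at most 65 percent of the real time is spent in executing software, and the operating system and control structures together account for 12 percent.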

Some portion of the operating system, control structures, and unit 
code time can be attributed to the use of a high-level language. 
However, without recoding specific procedures it is difficult to deter- 
mine what percentage is due to the language. Experiments have been 
performed comparing EPLX with the previously used language (EPL). 
In general, EPLX used more real time and program store than EPL. 
However, with some optimization the EPLX program was nearly as 
efficient as the EPL program. This indicates that there is little inherent 
inefficiency in the language. With proper knowledge of the language, 
programs can be optimized for both real time and program store usage. 


IV. SUMMARY 


This paper has described two specific examples which show how the 
No. 4 Ess has evolved through software restructure to better accom- 
modate the addition of new hardware and software features. Call 
Processing and Fault Recovery software underwent varying degrees of 
incremental restructure. These software areas were considered for 
restructure because many new features were to be added to the No. 4 
Ess which directly affected Call Processing and Fault Recovery. The 
restructuring efforts focused on improving the deficiencies of the pre- 
restructure software and made use of modern development method- 
ologies and a high-level programming language to accomplish the 
objectives. The resultant architectures are hierarchical, much more 
modular, and more easily modified and maintained. We acknowledge 
the effort of those designers too numerous to mention, who contributed 
to the successful Call Processing and Fault Recovery restructuring 
effort. 


The development plan of the No. 4 Ess included provisions for 
measuring the effectiveness of the design, operation, maintenance, 
and administration of the total system. This paper reviews system 
performance from 1976 to 1980, describes principal factors affecting 
system performance, and presents the service experience measured 
for the No. 4 Ess. Steady improvement has been measured in the 
number of service-affecting incidents experienced per office each 
month. This improvement is also reflected in the rate of cutoff and 
denied calls, as well as in system “no call processing” time. We 
discuss some of the factors influencing this performance record, e.g., 
a sound initial design, reliable hardware, effective maintenance and 
repair tools, continuing analysis and resolution of causes of service- 
affecting incidents, and continuing development of new features for 
performance improvement. 


I. INTRODUCTION 


The No. 4 Ess is a digital time-division toll and tandem switching 
system first placed in service in Chicago. It was described in the Bell 
System Technical Journal in 1976. Since then, 51 offices with over 
1,000,000 terminations have been put into service. The deployment 
progress is shown in Fig. 1. The average size of the No. 4 Ess is 22,000 
terminations with current office sizes ranging from 6,000 to over 60,000 
terminations. Detailed statistics demonstrate that the No. 4 Ess pro- 
vides high-quality service to its customers and that its performance 
continues to improve as the system matures despite office growth, new 
generic programs, and evolving hardware. 

Fig. 1—No. 4 Ess deployment. 


Substantial effort has been applied to developing methods and 
procedures for evaluating the performance of hardware and software 
in the No. 4 Ess. Data collection on performance parameters was built 
into the initial design so that performance data from many No. 4 Ess 
systems could be obtained easily and accurately. New performance 
criteria have been developed to measure the effect on the customer 
and to provide data for maintaining the hardware and software. 

A typical No. 4 Ess has intertoll and toll connecting trunks to about 
200 other switching entities. Therefore, because of its size and position 
in the network, its continuous availability for service is needed since 
any malfunction can affect communication in many areas of the 
country. All No. 4 Ess machines are staffed 24 hours a day, 7 days a 
week, and all service-affecting incidents are reported and analyzed. 
Special attention is given to correct the causes of service-affecting 
incidents. 

This paper describes some of the major system objectives, specific 
reliability and maintainability objectives, operational factors affecting 
performance, service experience, and methods used to manage per- 
formance. References 1 through 8 provide additional information on 
system performance. 


II. SYSTEM OBJECTIVES 


The traditional measure of telephone switching system reliability 
and performance is the amount of “no call processing” time in 40 
system years. This measure is a useful design objective, but it does not 
include all of the effects of complete and partial system failures which 
can lead to unsatisfactory performance from a customer viewpoint. 

The primary objective is to minimize the impact on the customer of 
all types of system failures. Consequently, cutoff calls and denied calls 
are among the most important performance indicators measured in 
the No. 4 Ess. Many other performance indicators are also measured 
to determine the effectiveness of maintenance and operation so that 
procedures and design problems can be corrected promptly. 

As an example, the derivation of the cutoff call objective for a toll 
call is shown in Fig. 2. Calls are assumed to pass through two local 
offices, two toll offices, and interconnecting transmission facilities. As 
shown, the overall call cutoff objective is less than or equal to 15 calls 
per 10,000, with an allocation to each switching entity of less than or 
equal to 1.25 calls per 10,000. 

Special performance criteria were set for the cutover of the first No. 
4 ESS in Chicago in 1976 (referred to as Chicago 7). They were 
expressed both as objectives and concern thresholds. Table I lists the 
objectives. 

Performance objectives have also been set for other performance 
indicators where supporting information is available. However, some 
performance measures are new, and the present, self-imposed, objec- 
tives are based on data obtained from typical No. 4 Ess offices and 
were not part of the original design objectives. The new objectives are 
described later in this paper. 

The design of reliable telephone switching systems involves built-in 
tools to measure performance, as well as reliable hardware, software, 
and equipment configurations. Objectives must be set that are strin- 
gent, yet attainable at a reasonable cost. Objectives for the No. 4 Ess 
performance are based on a reliability model and field data from the 
existing network. Advances in technology and the expectations of the 
public are also considered in setting objectives. 

The ultimate performance of telephone switching systems depends 
on design, as well as installation, operation, and maintenance. Conse- 
quently, standards have been developed for final installation accept- 
ance tests, daily equipment performance, and routine maintenance 
procedures. 

Fig. 2—Allocation of cutoff calls objective in calls per 10,000. The total cutoff call 
objective is less than or equal to 15 calls per 10,000. 

Table I—Chicago 7 cutover objectives 

Description               Objectives 
Ineffective Attempts      <1.25 percent 
Plug-in Replacements      <2 per day 
Interrupts                <50 per day 
Phases (2 or higher)      <1/2 per month 


III. RELIABILITY AND MAINTAINABILITY OBJECTIVES


A primary architectural feature of the No. 4 Ess is the system 
organization and design which provides dependable hardware and a 
software structure that can be operated and maintained by craft 
personnel. These objectives have been accomplished by using reliable 
circuitry and hardware redundancy with extensive supporting soft- 
ware. 

The software design provides centralized maintenance control from 
the 1A Processor. The processor and the peripheral equipment have 
configurable redundancy, which is accomplished automatically without 
affecting service. An automatic backup for the processor semiconduc- 
tor memory is provided by the disk system, which in turn has a 
magnetic tape system backup. A detailed description of circuit relia- 
bility and system redundancy can be found in Ref. 2. 


3.1 Reliability 


The basic element of a reliable system is well-designed hardware 
that includes trouble-detection features and ease of component re- 
placement. The development of the No. 4 Ess is based on a gold metal 
system for semiconductors and their interconnection. The connector 
contacts are also gold plated. The basic design features include open- 
frame convection cooling (rather than fan cooling) and the ability to 
operate in a temperature range of 30°F to 120°F. The hardware is 
designed to make per-frame checks and depends on a centralized 
software maintenance system to automatically reconfigure the hard- 
ware in case of trouble, to diagnose the frame reporting irregularities, 
and to locate the faulty component so it can be replaced by mainte- 
nance personnel. 

A reliability model was developed for the No. 4 Ess to help translate 
service objectives into a redundancy plan and to predict long-term 
performance. The No. 4 Ess reliability model specifies a number of 
hardware failure modes, determines their impact on performance, and 
predicts their likelihood of occurrence.” The model was derived
principally through analysis of predicted hardware failure rates, system
hardware configurations, and predicted repair times. However, the
model did not attempt to directly account for the following factors:
(i) procedural errors,
(ii) change activity,
(iii) growth,
(iv) retrofits,
(v) routine exercise,
(vi) software deficiencies, and
(vii) hardware design deficiencies.

Instead, the hardware failure rates predicted by the model were 
scaled to account for procedural errors and software errors based on 
experience gained from previous systems. No provision was made for 
generic program retrofits since their frequency is determined by the 
rate of new feature introduction in each office, which was unknown at 
that time. 

The hardware reliability of the overall system is a function of its 
size, hardware failure rates, redundancy plan, and mean repair times. 
Data taken over a 4-year period show that the predicted hardware 
failure rates essentially have been met. Special repair studies have 
been conducted which show that the mean time to repair solid faults 
is 1.25 hours while the mean time to repair intermittent faults is 20.5 
hours. As shown in Fig. 8, component failures cause only 11 percent of 
the service-affecting incidents. 
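
The way failure rate, redundancy, and repair time combine can be illustrated with a simple steady-state calculation. The sketch below is not the No. 4 ESS reliability model. It assumes a duplicated pair of identical units with independent failures and repairs, and the failure rate is a hypothetical value chosen only for illustration. The two mean repair times are the ones reported above.

```python
HOURS_PER_YEAR = 8766.0  # average year length, including leap years


def unit_unavailability(failures_per_year: float, mttr_hours: float) -> float:
    """Steady-state fraction of time a single repairable unit is down."""
    mttf_hours = HOURS_PER_YEAR / failures_per_year  # mean time to failure
    return mttr_hours / (mttf_hours + mttr_hours)


def duplex_unavailability(failures_per_year: float, mttr_hours: float) -> float:
    """Fraction of time both units of an independent duplicated pair are down."""
    return unit_unavailability(failures_per_year, mttr_hours) ** 2


if __name__ == "__main__":
    ASSUMED_FAILURES_PER_YEAR = 2.0  # hypothetical value, not from the paper
    for label, mttr in (("solid fault", 1.25), ("intermittent fault", 20.5)):
        u = duplex_unavailability(ASSUMED_FAILURES_PER_YEAR, mttr)
        print(f"{label}: duplex downtime ~ {u * HOURS_PER_YEAR * 60:.4f} min/year")
```

Because the duplex term grows with the square of the repair time, the long repair time for intermittent faults dominates under these assumptions.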


3.2 Maintainability 


The No. 4 ESS is designed to perform extensive maintenance functions
automatically so that problems are rapidly corrected and personnel
costs are minimized. The initial design provided work centers
at each office for maintenance and administration. Experience has
shown that centralized maintenance and administration for up to six
No. 4 ESSs is possible.

Switching Management Control Centers (SMCCs) have been implemented
to centralize the maintenance functions. This has led to the
centralization of expertise, reduction of total maintenance personnel, 
and improved system performance. Additional centralization of Ma- 
chine Administration Centers and Trunk Operations Centers is 
planned for the future.* 

Current field experience demonstrates that the basic system design 
is highly reliable and that craft-level personnel can maintain the 
system. Hardware displays, software support tools, and new mainte- 
nance documentation (task-oriented practices) have contributed sig- 
nificantly to the performance of the maintenance personnel. 


IV. OPERATIONAL FACTORS AFFECTING PERFORMANCE


The principal operational factors affecting performance of a No. 4 
ESS are the change and repair activities and some of the environmental 
factors that can affect No. 4 Ess service. Taken together, they represent 
a high level of activity in many offices. Section V presents performance 
statistics which include the service impact of these activities. 


4.1 Variety of system configurations 


One significant factor is the variety of configurations of the No. 4 
ESS. Each installation is engineered to match the service requirements 
of a particular location; therefore, each office is different. This implies 
that fault recognition and system recovery programs must be able to 
operate with any of the possible equipment configurations. 


4.2 Evolution of equipment 


As mentioned earlier, the equipment comprising the No. 4 Ess has 
evolved rapidly and many early offices have added each new type of 
equipment as it became available. The result is a mixture of vintages 
of equipment, complicating the environment in which system integrity 
and fault recognition programs must operate. An example is the first
No. 4 ESS office, Chicago 7. It has a mixture of core, small (64K)
semiconductor, and large (256K) semiconductor memory frames.
Similarly, in its time-division network, Chicago 7 has original vintage Time
Slot Interchange (TSI) frames, a cost-reduced vintage of TSI frames,
and the present version called the TSI-B.

Virtually every type of equipment has evolved to incorporate new
technology since initial introduction: the Digroup Terminal (DT) has
been cost reduced and replaced with the Digital Interface Frame (DIF),
the Signal Processor was replaced with the Signal Processor 2 and
eventually its signal processing function was incorporated into the DIF,
Common-Channel Interoffice Signaling (CCIS) terminals have been
improved, and the common control echo suppressor was added and
will be superseded by per-trunk echo cancelers. Figure 3 gives a more
complete picture of the evolution of No. 4 ESS equipment.


4.3 Growth activity 


The rate of growth additions to existing No. 4 Ess systems has 
increased steadily. Figure 4 shows the number of major growth jobs in 
progress and the number of new No. 4 Ess offices placed in service 
during each year since 1976. Nearly two-thirds of the operational No. 
4 ESS systems have been expanded with growth jobs. Through the end 
of 1979, growth activity had added over 900 frames of ESS equipment 
and provided over 350,000 new terminations, or nearly one-third of all 
installed No. 4 Ess terminations. Several offices have been expanded 

Fig. 4—Trend in No. 4 Ess growth activity.


these incidents have been human error, system software problems, and 
equipment failure in some of the new equipment shortly after it was 
made operational. Some of the improvements made in the growth 
process have been to incorporate temperature stress tests and extra 
network transmission path checks into selected growth procedures to 
improve the reliability of the new equipment once it is made opera- 
tional. 


4.4 Hardware change activity 


Over 400 Change Notices (CNs) have been prepared by Western
Electric to implement hardware changes in No. 4 ESS equipment. The
scope of CNs includes wiring changes, circuit pack changes (including
firmware updates), documentation, and addition of new types of equipment
to existing frames. CNs may be stimulated by design changes
initiated by Bell Laboratories or by manufacturability problems
discovered by Western Electric. All hardware changes are
authorized and monitored by the No. 4 ESS hardware change committee.
The Western Electric Product Engineering Control Center (PECC)
tracks application of CNs in the field.


4.5 Software change activity 


Software problems account for 25 percent of all No. 4 Ess service- 
affecting incidents. These are problems not detected in laboratory 
system tests or in first application office field tests. Such problems 
may go undetected until the generic program is introduced into an 
office with a particular configuration. Some software problems are 
caused by incomplete defensive checks and are only stimulated
through combinations of failures; others are simply design errors. 

Table II shows the size of the No. 4 Ess program with the introduc- 
tion of each new version. The numbers of problems corrected after the 
generic was placed in service are also shown. Although the quality of 
each generic issue is improving, as demonstrated by the decreasing 
number of service-affecting incidents per office (Fig. 7), the number of
field problems fixed has increased for each generic. This is a result of 
greater exposure to different office configurations and the contribution 
of undiscovered problems carried forward from previous generics. 
Generally, these software changes are of two types: the relatively few 
urgent fixes are called out to all offices or transmitted by the Software 
Change Administration and Notification System and installed with 
generic utility overwrites; the remainder are installed only when a 
partial update is distributed to each office. A partial update is a 
technique for introducing large numbers of program corrections with- 
out affecting service. Figure 5 shows a plot of the problems identified, 
fixes under test, and overwrites delivered to the field for the 4E4 
generic. 

One of the major reasons the No. 4 Ess has provided excellent 
service, despite the existence of software problems, is its basic system 
architecture and software integrity design. It is not technically or 
economically feasible to detect and fix all software problems in a 
system as large as the No. 4 Ess. Consequently, a strong emphasis has 
been placed on making it sufficiently tolerant of software errors to 
provide successful operation and fault recovery in an environment 
containing software problems. 

Another type of software change activity involves the office data 
base which includes translations, parameters, trunking, and routing 
information. Occasionally, corrections and changes are made to the 
office data base with standard recent change methods and also with 
generic utility system overwrites. 


Table II—Field problems

Generic   Service Date   Total Size (K words)   No. of Field Problems Fixed
4E0       1/76           1160                   —
4E1       7/76           1312                   249
4E2       6/77           1405                   352
4E3       4/78           1663                   434
4E4       2/79           1736                   411*
4E5       2/80           2162                   184*

* Problems fixed in field as of August 26, 1980.

Fig. 5—4E4 generic problem status.


4.6 Retrofits


A major type of software change activity is a generic retrofit in
which each No. 4 ESS replaces its current generic program with the
latest generic. Current plans call for each office to receive the new
generic within a year of its official release. Figure 6 shows the number
of retrofits each year since 1976, indicating a large increase as new
offices have been added.

A new office data base is compiled for each retrofit. The data base
is expanded in anticipation of future growth and also includes a
recompiled description of the current office data. Other types of
software changes generally made during retrofits, and also once
between them (as “midgeneric releases”), are the introduction of new
network management software, new trouble-locating procedure tapes
that help office maintenance personnel locate faulty circuit packs when
diagnostic tests indicate trouble, and new library programs that contain
infrequently used test and administrative programs.

Fig. 6—Trend in No. 4 ESS retrofit activity.


4.7 Rearrangements 


In addition to hardware changes, software changes, growth, and
retrofit activity, office performance can also be affected by major
rearrangements. Three principal kinds of rearrangements have occurred.
First, Common-Channel Interoffice Signaling terminals in the first
28 offices are being rearranged to improve system reliability. This
involves growth of new terminals and execution of a special library
program to modify 12 translators in the office data base to effect a new
terminal pairing arrangement. The second major rearrangement was
a series of activities to allow one office to serve as a gateway office, a
function normally planned when an office is first installed. The third
type of rearrangement was a change of the pulse point control
for large numbers of frames in one office to increase its reliability.


4.8 Repair 


Equipment fails and requires repair on an ongoing basis in No. 4 Ess 
offices. The average circuit-pack replacement rate for the first quarter 
of 1980 was 1.7 per day per office. This is half the rate experienced 
during the first 122 days of Chicago 7 operation in 1976, and it meets 
the short-term objective of less than two per day per office.’ To place 
this number in perspective, a typical No. 4 Ess contains 50,000 circuit 
packs. In a small fraction of cases, office technicians must use oscillo- 
scopes and probe communication buses and backplane wiring to isolate 
equipment faults. Such routine repair of equipment often involves 
several steps, and human error in performing them accounts for 18 
percent of the service-affecting incidents. 
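
To put the replacement rate in proportion to office size, the two figures quoted above can be combined into a rough annual fraction. This is only illustrative arithmetic: it assumes a 365-day year and treats every replacement as a distinct circuit pack, which the text does not state.

```python
PACKS_PER_OFFICE = 50_000     # typical office, from the text
REPLACEMENTS_PER_DAY = 1.7    # first quarter of 1980, from the text
DAYS_PER_YEAR = 365           # assumption for the illustration

replaced_per_year = REPLACEMENTS_PER_DAY * DAYS_PER_YEAR
annual_fraction = replaced_per_year / PACKS_PER_OFFICE

print(f"about {replaced_per_year:.0f} replacements per year")
print(f"about {annual_fraction:.1%} of the packs in a typical office")
```

Under these assumptions, roughly one pack in eighty is replaced in a year.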


4.9 Other factors 


Although No. 4 Ess offices are well-protected from most external 
factors, some have had an impact on service. In particular, some offices 
have been affected by air-conditioning problems, power-distribution 
failures, failure of non-No. 4 Ess equipment, and static discharge. 


V. SERVICE EXPERIENCE 
5.1 Service-affecting incidents 


To track the performance of No. 4 ESS, the notion of a service-affecting
incident (or simply, incident) has been defined as those
equipment failures and major system recovery actions with a significant
effect on the customer. Specifically, they include:
(i) Hardware failures affecting more than 360 trunks.
(ii) System recovery directed Phase 1 and Phases 2, 3, and 4.
(iii) System reinitializations.


5.1.1 Hardware failures 


Hardware failures affecting more than 360 trunks are called Multiple
Unit Failures (MUFs). Originally, a MUF represented half the trunks
served by a Voiceband Interface Frame (VIF). With the addition of
frames, such as the DIF serving up to 3840 trunks, a MUF is now defined
as an outage affecting more than 360 trunks. In duplicated equipment,
duplex failures and/or restoral from them also cause the recovery
actions described in Section 5.1.2.


5.1.2 System recovery 


When the No. 4 Ess must halt call processing to recover from 
problems, the result is called a system recovery phase. In a directed 
Phase 1, all calls associated with a duplex-failed peripheral frame are 
lost; however, the other stable calls in the system are saved. A directed 
Phase 1 can have a duration from 1 to 15 seconds. 

A Phase 2 is used to recover from memory mutilation or peripheral 
configuration problems. It checks the integrity of fixed data, such as 
program store with a hashing algorithm, reconfigures the peripheral 
complex with peripheral bootstrap (when F-level interrupts implicate 
the periphery), and initializes most of the call store memory spectrum 
that is not related to stable calls. A Phase 2 saves stable calls and 
requires less than 30 seconds if peripheral recovery is not required, 
and less than 60 seconds if it is. Calls in the dialing state are lost during 
a Phase 2. 
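
The integrity check of fixed data can be pictured as comparing a freshly computed hash of each block against a stored reference value. The text does not name the hashing algorithm or the block layout, so the sketch below uses CRC-32 purely as a stand-in, and the block names and the function `find_mutilated_blocks` are hypothetical.

```python
import zlib


def block_hash(data: bytes) -> int:
    """Hash of one block of fixed data (CRC-32 is only a stand-in here)."""
    return zlib.crc32(data)


def find_mutilated_blocks(blocks: dict, reference: dict) -> list:
    """Return the names of blocks whose current hash differs from the reference."""
    return sorted(name for name, data in blocks.items()
                  if block_hash(data) != reference[name])


if __name__ == "__main__":
    store = {"block_a": b"\x01\x02\x03\x04", "block_b": b"\x10\x20\x30\x40"}
    reference = {name: block_hash(data) for name, data in store.items()}
    store["block_b"] = b"\x10\x20\x30\x41"  # simulate mutilation of one byte
    print(find_mutilated_blocks(store, reference))
```

A block flagged this way would then be a candidate for restoration from a backup copy.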

A Phase 3 is used when a complete processor or peripheral recon- 
figuration is required. It lasts from 1 to 4 minutes, depending on office 
size, and saves stable calls. Calls in the dialing state are lost, as in a 
Phase 2. 

A Phase 4 is similar to a Phase 3, but it is initiated manually and 
disconnects all calls. 


5.1.3 System reinitialization 


A System Reinitialization is a complete reload of the generic pro- 
gram from magnetic tape. It is required only under the most severe 
cases in which data in program store and both file stores are mutilated. 
It can take up to 20 minutes and it disconnects all calls. 
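
The recovery actions described in Sections 5.1.2 and 5.1.3 can be summarized as a small lookup table. The sketch below only restates the durations and call-handling effects given in the text; the class and field names are illustrative, and the Phase 4 duration is left unset because the text says only that it is similar to a Phase 3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryAction:
    name: str
    max_seconds: Optional[int]    # longest duration quoted in the text, if any
    disconnects_all_calls: bool   # False means stable calls are saved
    note: str


RECOVERY_ACTIONS = (
    RecoveryAction("directed Phase 1", 15, False,
                   "calls on the duplex-failed frame are lost"),
    RecoveryAction("Phase 2", 60, False,
                   "under 30 s without peripheral recovery; dialing calls lost"),
    RecoveryAction("Phase 3", 240, False,
                   "1 to 4 minutes by office size; dialing calls lost"),
    RecoveryAction("Phase 4", None, True,
                   "manually initiated; similar to Phase 3"),
    RecoveryAction("system reinitialization", 1200, True,
                   "generic program reloaded from magnetic tape"),
)


def actions_within(seconds: int) -> list:
    """Names of actions whose quoted maximum duration fits in `seconds`."""
    return [a.name for a in RECOVERY_ACTIONS
            if a.max_seconds is not None and a.max_seconds <= seconds]


if __name__ == "__main__":
    print(actions_within(60))
    print([a.name for a in RECOVERY_ACTIONS if a.disconnects_all_calls])
```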


5.1.4 Number of incidents 


When several recovery phases or MUFs are stimulated by the same 
event, or follow in succession, they are considered a single incident. All
No. 4 ESS service-affecting incidents are recorded and analyzed. The 
record of these incidents provides an extremely valuable method for 
evaluating system performance and for guiding efforts to improve it. 
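
The rule that recovery phases or MUFs following in succession count as one incident can be sketched as a grouping of time-stamped events. The text does not say how close together events must be to count as "in succession", so the 30-minute gap below is a hypothetical threshold, and the function name is illustrative.

```python
from datetime import datetime, timedelta


def group_into_incidents(event_times, max_gap=timedelta(minutes=30)):
    """Group event timestamps; events within max_gap of the previous one share an incident."""
    incidents = []
    for t in sorted(event_times):
        if incidents and t - incidents[-1][-1] <= max_gap:
            incidents[-1].append(t)   # continues the current incident
        else:
            incidents.append([t])     # starts a new incident
    return incidents


if __name__ == "__main__":
    events = [
        datetime(1980, 3, 1, 2, 0),   # for example, a MUF
        datetime(1980, 3, 1, 2, 5),   # a recovery phase that follows it
        datetime(1980, 3, 1, 2, 20),  # a further recovery phase
        datetime(1980, 3, 1, 9, 0),   # an unrelated event later that day
    ]
    print(len(group_into_incidents(events)))
```

Grouping by time alone is a simplification; attributing events to a common cause would in practice need the analysis described in the text.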

Figure 7 is a graph of the number of service-affecting incidents per 
office per month. The trend indicated is a reduction in the average 
number experienced by an office to 1.4 per month during the first 
quarter of 1980. It is significant that a high fraction of service-affecting 
incidents occur in low traffic periods. Over 55 percent of the “no call
processing” time (see Section 5.3.1) has occurred between midnight
and 8:00 a.m., largely because routine exercise, complicated
repairs, change installation, growth activity, retrofits, and other
high-risk activities are generally scheduled during the periods of lowest
traffic.

Each service-affecting incident is classified into one or more of the
categories shown in Fig. 8. Software design problems account for 25
percent of the total causes. These problems form the basis of an
investigation list that is used to guide software current engineering
effort. The expected category comprises 16 percent of the incidents.
These are cases in which the system reacted as expected, such as
planned retrofits, intentional test phases, or when it is impossible to
resolve a problem to the proper unit of a duplicated pair and the
system must randomly choose the unit to be removed. Duplex frame
failures are incidents that occur because a frame is simplex for repair
and a fault develops in the active controller. They comprise 11 percent
of the total. Unresolved incidents are the 13 percent for which
sufficient data to thoroughly analyze the source of the incident are
unavailable. Hardware design incidents are the 4 percent caused by the
hardware design of a particular frame or subunit. Hardware design
problems are considered by the No. 4 ESS hardware change committee and
fixes are scheduled as appropriate. Wiring errors account for 8 percent
of the incidents and include wiring breaks or loss of insulation
integrity as well as errors or wire clippings inadvertently left in
equipment when it was being repaired or modified. The technician error
category includes operating telephone company craft and Western Electric
installer errors, and comprises 18 percent of the total. Figure 8 also
shows the causes for service-affecting incidents by their contribution
to system no call processing time.

Fig. 7—Service-affecting incidents. The average for the first quarter of 1980 was 1.4
incidents per office per month.

Fig. 8—Causes of service-affecting incidents, cumulative through March 31, 1980. (a)
Percent of incidents. (b) Percent of no call processing time.


5.2 Customer impact 


The principal No. 4 Ess performance measures are those that show 
the impact of service-affecting incidents on the customer: cutoff calls 
and denied calls. 

Figure 9 shows the rate of calls cut off by the No. 4 ESS. The first
quarter, 1980, rate was 0.18 per 10,000 calls, well below the objective
of 1.25 per 10,000. Denied calls measure the No. 4 ESS contribution
to the customer’s inability to complete calls on demand because of
no call processing time. During the first quarter, 1980, the rate of
denied calls was 0.28 per 10,000. The trend in the number of calls
denied by the No. 4 ESS is shown in Fig. 10. The effect on the customer
of denied calls is difficult to measure, since alternate routing strategies
elsewhere in the network can compensate for some No. 4 ESS denied
calls, often allowing the customer to complete the intended call. Both
measures show substantial improvement over the period of time the
No. 4 ESS has been deployed.

Fig. 9—Cutoff calls. The first quarter 1980 rate was 0.18 per 10,000 calls, well below
the objective, which was 1.25 per 10,000 calls.


5.3 System performance 


In addition to cutoff and denied calls, other performance factors are
also used to give a more comprehensive measure of system performance.
They are system- rather than customer-related measures of
system performance and include:
(i) no call processing time,
(ii) trunk outage time, and
(iii) Ineffective Machine Attempts (IMA).

Fig. 10—Denied calls. In the first quarter of 1980, the rate of denied calls was 0.28 per
10,000 calls.


5.3.1 “No call processing” time


No call processing time is often expressed in terms of hours of time
in 40 years. It includes outage time required for system reinitialization
such as Phases 2, 3, and 4 and directed Phase 1 recovery actions. Note
that during the No. 4 ESS no call processing time caused by Phase 2
and Phase 3, all stable calls continue, unless there is also a duplex
failure of network or network interface equipment. Figure 11 illustrates
that the long-term trend has been an improvement in “no call
processing” time to a first quarter, 1980, rate of 9.9 hours in 40 years. Since
generic retrofits and data base updates require use of an intentional
Phase 3 during the lowest traffic periods, there is a built-in requirement
that approximately 1 hour in 40 years of this total be used for this
purpose. Network management controls applied as part of the retrofit
procedure virtually eliminate any customer impact. The rate of 9.9
hours in 40 years includes contributions from all of the factors shown
in Fig. 8. It is significant that factors, such
as procedural errors and software deficiencies, that could not be
specifically modeled (see Section III), account for nearly two-thirds of
all downtime. Consequently, the internal objective of 2 hours in 40
years of total system unavailability is under review. Nevertheless, no
call processing time has steadily improved as maintenance and reliability
enhancements have been added to the system.

Fig. 11—No call processing. The first-quarter 1980 rate was 9.9 hours in 40 years. The
objective was 2.0 hours in 40 years.
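
For readers more used to availability figures, the "hours in 40 years" unit converts directly to minutes per year and to a fraction of time. The sketch below applies that conversion to the 9.9-hour rate and the 2-hour objective quoted above; the year length of 8766 hours is an assumption, and the function names are illustrative.

```python
HOURS_PER_YEAR = 8766.0  # average year length, including leap years


def minutes_per_year(hours_in_40_years: float) -> float:
    """Convert an 'hours in 40 years' figure to minutes of downtime per year."""
    return hours_in_40_years * 60.0 / 40.0


def unavailability(hours_in_40_years: float) -> float:
    """Convert an 'hours in 40 years' figure to a fraction of total time."""
    return hours_in_40_years / (40.0 * HOURS_PER_YEAR)


if __name__ == "__main__":
    for label, hours in (("first quarter 1980", 9.9), ("objective", 2.0)):
        print(f"{label}: {minutes_per_year(hours):.2f} min/year, "
              f"availability {1.0 - unavailability(hours):.6f}")
```

On this scale the first-quarter 1980 rate is a little under 15 minutes per year, against an objective of 3 minutes per year.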

Figure 12 shows the effect of two recent enhancements. It presents
overlapping histograms showing the distribution of no call processing
incidents for two 6-month periods, one ending on March 31, 1979, and
another ending 1 year later. The significance of the first histogram is
that it represents No. 4 ESS performance before the directed Phase 1
feature was available. The directed Phase 1 was introduced in the 4E4
generic program and has been deployed both in new offices and
through generic program retrofits. By March 31, 1980, all offices had
the directed Phase 1 feature. Normally, the directed Phase 1 takes
about 2 seconds to initialize a duplex-failed TSI frame. Prior to the
directed Phase 1, a 1- to 3-minute Phase 3 was required. The significance
of the second histogram is that the directed Phase 1 shifted the
distribution so that 34 percent of all no call processing incidents require
less than 30 seconds as compared with 2 percent prior to directed
Phase 1. An additional enhancement, introduced late in 1979, was a
shortened Phase 2 when no peripheral equipment was suspected by
system integrity programs. This also reduces the no call processing
time.


5.3.2 Trunk outage time 


Trunk outage time is the measure of hardware failures, such as 
duplex-failed equipment or MUFs. Note that no call processing time is
not included in trunk outage time measurements. Figure 13 shows a 
graph of No. 4 Ess trunk outage time. During the first quarter of 1980, 
the system performance was 38.0 minutes of outage per trunk per year 
compared with an objective of 28.0. Several maintenance enhance- 
ments are planned to help bring No. 4 Ess performance closer to this 
objective. 


5.3.3 Ineffective machine attempts 


Some customer attempts to originate calls result in noncompleted
calls. The No. 4 ESS has a large and precise ineffective-attempt reporting
system that measures call failure statistics and allows an analysis
of chronic problems. Over 300 call-failure modes are defined, including
customer errors, failure of switching machines or transmission media
connected in an incoming mode to the No. 4 ESS, failure of the No. 4
ESS to establish a cross-office connection, and a failure of the switching
machine or transmission media connected in an outgoing mode to the
No. 4 ESS. A subset of the total ineffective attempts is classified as an
IMA. These include calls that must be terminated with incoming,
connecting or outgoing reorder tone, vacant code announcements, or
no-circuit tone.

Fig. 12—Incident duration for two six-month intervals that show the impact of the
directed Phase 1 and shortened Phase 2.

Figure 14 shows that the average adjusted domestic IMA performance
has remained relatively constant at a little over 1 percent of all
attempts. The rate during the first quarter of 1980 was 1.02 percent,
meeting the original objective of 1.25 percent. The rate for calls to
other countries is higher. A study of the specific failures shows that
the No. 4 ESS and outgoing trunks contribute to less than 0.01 percent
of the total number of IMAs. Most failures originate from irregularities
in the incoming network. Further analysis shows that in large
metropolitan systems, such as those in Chicago and New York City where
common control Class 5 offices with multifrequency signaling or CCIS
are used, the reorder component of IMA for domestic calls ranges
only between 0.2 and 0.3 percent. However, where step-by-step or early
vintage crossbar switches are used, the reorder IMA ranges between 2
and 3 percent even though the equipment is properly maintained. This
can frequently be attributed largely to outside plant problems not
screened by these systems. The IMA data are effective in identifying
network problems, and also serve as a continuous check on network
performance.

Fig. 13—Trunk minutes out of service. For the first quarter of 1980 the system
performance was 38.0 minutes of outage per trunk per year. The objective was 28.0.

Fig. 14—Ineffective machine attempts. The first quarter of 1980 had a rate of 1.02
percent. The original objective was 1.25 percent.


5.4 Interrupts 


One of the most closely watched system maintenance indicators in 
No. 4 Ess is the level of system interrupts. They generally indicate an 
unexpected response from a system action. For example, an equipment 
failure that affects a path through the time-division network may 
cause interrupts. (For a more complete description of system inter- 
rupts, see Refs. 2 and 5.) 

Although interrupts do not directly affect the customer, an objective
has been set to help manage system maintenance activity. When the
interrupt level rises, more attention must be devoted to maintenance.
The original empirical interrupt objective of less than 50 per day has
been tightened to an average of less than 40 per day. Some small
offices have an objective that is more stringent since they have less
equipment. The average number of interrupts per office during the
first quarter of 1980 was less than 25 per day, meeting the objective.


VI. MANAGING PERFORMANCE 
6.1 Ongoing development 


The original design and implementation of the No. 4 Ess are key 
factors in allowing the system to provide the current level of service. 
However, another key ingredient has been the management of No. 4 
ESS performance. 

Each service-affecting incident is recorded in a data base and analysis
is performed monthly to track the overall performance. When
analysis shows that specific changes can improve
system performance, they become candidates for features to be developed
as part of the next generic program release. Committees review
each new feature candidate for its impact on system resources, the
development effort required, and the feature’s value relative to other
candidates. The directed Phase 1 was such a feature; it was proposed
when analysis showed it could reduce system no call processing time.


6.2 Current engineering 


In addition to new features aimed at improving performance, an 
ongoing effort also exists to identify problems in existing systems and 
to deliver fixes. Specific responsibility for carrying out this effort is 
assigned to a group that works closely with developers to generate the 
necessary fixes. Much of this effort is directed toward the large generic 
program. However, with the rapid introduction of new equipment, all 
modifications to existing hardware designs are also tracked by the 
hardware change committee. 


6.3 Acceptance tests


In addition to its basic design, No. 4 ESS performance is affected by
how well each new system is installed and how in-service systems are
operated. New systems must meet rigorous operational readiness tests
and final verification acceptance tests before they are turned over from
the installer to the operating telephone company (OTC). Before the OTC
places the system in service, it must meet another set of performance
criteria, of which the 7-day sliding interrupt average is the most visible.
These performance criteria are specified in Bell System Practices and
Western Electric Installation Handbooks expressly to help the OTCs
manage the quality of initial service they offer. After initial service,
extensive service results performance measurements or indices are
used to help judge the effectiveness of the team operating each No. 4
ESS.
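
A 7-day sliding interrupt average of the kind mentioned above can be computed from daily interrupt counts as follows. This is a minimal sketch: the acceptance threshold is not given in the text, so the default of 40 per day simply borrows the in-service objective from Section 5.4 as an assumption, and the function names are illustrative.

```python
def sliding_average(daily_counts, window=7):
    """Average of each run of `window` consecutive daily counts."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(daily_counts[i - window:i]) / window
            for i in range(window, len(daily_counts) + 1)]


def meets_objective(daily_counts, objective=40.0, window=7):
    """True if at least one full window exists and every average is below the objective."""
    averages = sliding_average(daily_counts, window)
    return bool(averages) and all(avg < objective for avg in averages)


if __name__ == "__main__":
    counts = [30, 22, 45, 18, 27, 35, 20, 90, 25]  # hypothetical daily interrupts
    print(sliding_average(counts))
    print(meets_objective(counts))
```

A single bad day raises the average for the seven windows that contain it, which is why the sliding form is more forgiving than a per-day limit while still flagging sustained trouble.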


6.4 Managing deployment 


Besides the performance of each individual No. 4 Ess, performance 
management has been extended to help govern the rate at which new 
systems are deployed with new software and hardware. Specific rec- 
ommendations have been published in cooperation with AT&T that 
establish intervals after the first application office for subsequent new 
offices and for the beginning of the generic retrofit program. These 
recommendations limit the initial exposure of new software and hard- 
ware until sufficient experience is gained under actual operating con- 
ditions to allow rapid deployment with confidence that service per- 
formance standards will be maintained. 

The recommendations also specify the composition and duties of 
steering and cutover committees for each new system and major 
growth job. Recent experience indicates that these committees can be 
very effective and are key ingredients in the smooth transition from an 
earlier system to a new No. 4 Ess. 

As indicated in Section III, there are many demands for changes, 
rearrangements, and additions to existing systems. To help manage 
this high level of activity, as well as to arbitrate schedule conflicts for 
new systems, retrofits, and data base updates, an Implementation 
Review Committee was formed with representatives knowledgeable in 
OTC needs, Western Electric production and installation capacities, 
and Bell Labs development capabilities and schedules. One of its tasks 
is to help manage peak demands, such as the high fraction of systems 
requesting spring service dates to help meet busy season traffic de- 
mands. 


VII. SUMMARY 


The No. 4 Ess has been incorporated successfully into the Bell 
System and international telecommunications network. Since the 
cutover of the first system in Chicago in January 1976, 51 systems 
terminating over 1,000,000 trunks have been put into service. During 
this period, the hardware and software have evolved to include the 
latest technology, which has made possible additional equipment cost 
savings and a reduction in space and power requirements. 

Experience with the No. 4 Ess has confirmed the original design 
criteria for improved reliability and maintainability in stored pro- 
grammed control systems as follows: 

(i) Reliability, maintainability, and administrative features must 
be included in the original architecture of the entire system. 
(ii) Software integrity features are necessary to allow large systems 
to perform successfully in an environment containing software prob- 
lems. 

(iii) Automatic and semiautomatic maintenance aids are mandatory 
for maintaining modern systems. 

(iv) Many factors other than component failures cause reliability 
problems and must be considered in basic design decisions. 

(v) Built-in facilities for continually measuring performance param- 
eters are needed to make sure that performance criteria are met and 
to identify where improvements are required. 

(vi) Performance criteria should be based on customer impact. 

Inclusion of these concepts in the No. 4 Ess has been a major factor 
in its excellent performance and rapid deployment. 
ACRONYMS AND ABBREVIATIONS 


ADF         arranged with data features
ADS         announcement distribution system
ALU         arithmetic logic unit
APUF        autonomous peripheral unit failure
ASW         all seems well
AWG         American Wire Gauge
AUTOVON     automatic voice network
BLM         base level maintenance
BOOTCNTL    bootstrap control program
BSRF        Bell System reference frequency
CAMA        centralized automatic message accounting
CAROT       centralized automatic reporting on trunks
CC          control clock
CCIS        common channel interoffice signaling
CCITT       International Telegraph and Telephone Consultative
            Committee (Comite Consultatif International
            Telegraphique et Telephonique)
CESR        controller error source register
CI          control interface
CLF         clear forward
CMOS        complementary MOS
CMP         complete
CMS         circuit maintenance system
CPU         central processing unit
CR          call register
DCON        diagnostic control
DDD         direct distance dialing
DDI         direct data insert
DIC         digital interface controller
DIF         digital interface
DIFRINTR    digital interface frame interrupt recovery package
DIPs        dual in-line packages
DIU         digital interface unit
DMA         direct memory access
DP          dial pulse
DT          digroup terminal
DTCRs       dedicated time-slot interchange connection registers
ECL         emitter-coupled logic
EIA         Electronic Industries Association
EIB         extended internal buses
ENSIG       enable signaling
            electronic programming languages
            electronic programming language extended
            error source register
            electronic switching system
            executive controller
            failure error analysis
            framing and receiving
            frame request and diagnostic interface
            internal bus
            internal bus multiplexer
            integrated circuit
            incoming trunk
            insulated gate field-effect transistor
            ineffective machine attempts
            inward wide area telecommunications service
            input/output processor
            interface to peripheral unit bus
LSI         large-scale integration
MAC         machine administration center
MAPT        mass announcement phasing table
MAR         mass announcement register
MAS         mass announcement system
MAUs        mass announcement units
MCC         master control center
MEDIC       message dispenser and coordinator
MF          multifrequency
MMC         maintenance microcomputer
MOC         maintenance operations center
MOS         metal oxide semiconductor
MP          maintenance processor
MSC         media stimulated calling
MSS         mass announcement support system
MSTAT       mass announcement system status table
MUDS        machine updatable data system
MUF         multiple unit failures
NCLK        network clock
NCSU        network clock synchronization unit
NMOS        N-channel metal oxide semiconductor
OGT         outgoing trunk
OGTS        outgoing trunk select
ONAC        operations network administration center
OOF         out of frame
OP          operation
OSC         oscillator
OTC         operating telephone company
PAS         public announcement service
PC          peripheral controller
PCM         pulse-code modulated
PCS         power control switch
PCSINH      per-channel signaling inhibit
PECC        product engineering control center
PIC         peripheral interface controller
PIDENT      program identification
PMOS        peripheral maintenance operating system
PROM        programmable read only memory
PU          peripheral unit
PUB         peripheral unit bus
PUC         peripheral unit control (controller)
PUCB        peripheral unit control bus
PUEAB       peripheral unit enable/address bus
PUF         peripheral unit failure
PUFR        peripheral unit fault recovery
PURB        peripheral unit reply bus
PUWB        peripheral unit write bus
PWB         printed wiring board
RAM         random access memory
RCRRT2      remove recent change
RCV         receive
REF         reference
REG         register
REPODISP    report dispenser
ROM         read only memory
RPT         report scan
RPY         reply
SFPD        superframe pattern detector
SMCC        switching management control centers
S/P         serial to parallel
SP1         signal processor type 1
SYSCLK      system clock
TDN         time-division network
TEC         terminal equipment center
TMR         trunk maintenance register
TMS         time multiplexed switch
TMT         transmit
TOC         trunk operations center
TOPIC       toll peripheral configuration
TR          trunk register
TS          time slot
TSI         time-slot interchange
TSI SPC     time-slot-interchange switching and permuting circuits
TSN         trunk scanner number
TSPS        traffic service position system
TTL         transistor-transistor logic
TTY         teletypewriter
TU          time of unit
UART        universal asynchronous receiver transmitter
UMB         unit maintenance
UTE         unitized terminal equipment
VIF         voiceband interface frame